Abstract: The role of multiple antennas for secure communication
is investigated within the framework of Wyner's
wiretap channel. We characterize the secrecy capacity in terms of 
generalized eigenvalues when the sender and eavesdropper have 
multiple antennas, the intended receiver has a single antenna, and 
the channel matrices are fixed and known to all the terminals, 
and show that a beamforming strategy is capacity-achieving. In 
addition, we show that in the high signal-to-noise ratio (SNR)
regime the penalty for not knowing the eavesdropper's channel is
small: a simple "secure space-time code" that can be thought
of as masked beamforming and radiates power isotropically
attains near-optimal performance. In the limit of a large number of
antennas, we obtain a realization-independent characterization of
the secrecy capacity as a function of the number $\beta$, the number
of eavesdropper antennas per sender antenna. We show that the
eavesdropper is comparatively ineffective when $\beta < 1$, but that
for $\beta \ge 2$ the eavesdropper can drive the secrecy capacity to zero,
thereby blocking secure communication to the intended receiver. 
Extensions to ergodic fading channels are also provided. 

Index Terms — Wiretap channel, cryptography, multiple an- 
tennas, MIMO systems, broadcast channel, secrecy capacity, 
masked beamforming, artificial noise, generalized eigenvalues, 
secure space-time codes. 



I. Introduction 

MULTIPLE-ELEMENT antenna arrays are finding grow- 
ing use in wireless communication networks. Much 
research to date has focused on the role of such arrays in 
enhancing the throughput and robustness for wireless commu- 
nication systems. By contrast, this paper focuses on the role 
of such arrays in a less explored aspect of wireless systems — 
enhancing security. Specifically, we develop and optimize 
physical layer techniques for using multiple antennas to
protect digital transmissions from potential eavesdroppers, and 
analyze the resulting performance characteristics. 

A natural framework for protecting information at the 
physical layer is the so-called wiretap channel introduced 
by Wyner [1] and associated notion of secrecy capacity. In 
the basic wiretap channel, there are three terminals — one 
sender, one receiver and one eavesdropper. Wyner's original 
treatment established the secrecy capacity for the case where 
the underlying broadcast channel between the sender and the 
receiver and eavesdropper is a degraded one. Subsequent work 
generalized this result to nondegraded discrete memoryless 
broadcast channels [2], and applied it to the basic Gaussian
channel [3]. 

Motivated by emerging wireless communication applica- 
tions, there is growing interest in extending the basic Gaussian 
wiretap channel to the case when the terminals have multiple 
antennas; see, e.g., [4]-[12] and the references therein. While 
in principle the secrecy capacity for such nondegraded broadcast
channels is developed in [2] by Csiszár and Körner, the
solution is in terms of an optimized auxiliary random variable 
and has been prohibitively difficult to explicitly evaluate. 
Thus, such characterizations of the solution have not proved 
particularly useful in practice. 

In this paper, we investigate practical characterizations for 
the specific scenario in which the sender and eavesdropper 
have multiple antennas, but the intended receiver has a single 
antenna. We refer to this configuration as the multi-input, 
single-output, multi-eavesdropper (MISOME) case. It is worth 
emphasizing that the multiple eavesdropper antennas can cor- 
respond to a physical multiple-element antenna array at a 
single eavesdropper, a collection of geographically dispersed 
but perfectly colluding single-antenna eavesdroppers, or related
variations. 

We first develop the secrecy capacity when the complex 
channel gains are fixed and known to all the terminals. A 
novel aspect of our derivation is our approach to (tightly) 
upper bounding the secrecy capacity for the wiretap channel. 
Our result thus indirectly establishes the optimum choice of 
auxiliary random variable in the secrecy capacity expression 
of [2], addressing an open problem. 

While the capacity-achieving scheme generally requires
that the sender and the intended receiver have knowledge
of the eavesdropper's channel (and thus the number of antennas
as well), which is often not practical, we further show
that performance is not strongly sensitive to this knowledge. 
Specifically, we show that a simple masked beamforming 
scheme described in [4], [5] that does not require knowledge 
of the eavesdropper's channel is close to optimal in the high 
SNR regime. 

In addition, we examine the degree to which the eaves- 
dropper can drive the secrecy capacity of the channel to zero, 
thereby effectively blocking secure communication between 
sender and (intended) receiver. In particular, for Rayleigh 
fading in the large antenna array limit, we use random matrix 
theory to characterize the secrecy capacity (and the rate 
achievable by masked beamforming) as a function of the ratio 
of the number of antennas at the eavesdropper to that at the 
sender. Among other results in this scenario, we show that 1) 
to defeat the security in the transmission it is sufficient for 
the eavesdropper to use at least twice as many antennas as
the sender; and 2) an eavesdropper with significantly fewer 
antennas than the transmitter is not particularly effective. 

Our results extend to the case of time-varying channels. We
focus on the case of fast (ergodic, Rayleigh) fading, where the 
message is transmitted over a block that is long compared to 
the coherence time of the fading. In our model the state of the 
channel to the receiver is known by all three parties (sender, 
receiver, and eavesdropper), but the state of the channel to the 
eavesdropper is known only to the eavesdropper. Building on 
techniques developed for the single transmitter antenna wiretap 
problems [8], [9], we develop upper and lower bounds on the 
secrecy capacity both for finitely many antennas and in the 
large antenna limit. 

As a final comment, we note that the idea of protecting 
information at the physical layer (rather than the application 
layer) is not a conventional approach in contemporary cryp- 
tography. Indeed, the common architecture today has the lower 
network layers focus on providing a noiseless public bit-pipe 
and the higher network layers focus on enabling privacy via the 
exchange and distribution of encryption keys among legitimate 
parties prior to the commencement of communication. As 
discussed in [7], [9], for many emerging applications, existing 
key distribution methods are difficult to exploit effectively. In 
such cases, physical-layer mechanisms such as those devel- 
oped in this paper constitute a potentially attractive alternative 
approach to providing transmission security. 

The organization of the paper is as follows. Section II 
summarizes some convenient notation used in the paper and 
some mathematical preliminaries. Section III describes the 
channel and system model of interest. Section IV states all 
the main results of the paper. The proofs of our results appear 
in subsequent sections and the more technical details are pro- 
vided in the Appendices. Section V provides an alternate upper 
bound while Section VI provides the secrecy capacity. Our 
analysis of the masked beamforming scheme is provided in 
Section VII while the scaling laws of the secrecy capacity and 
the masked beamforming scheme are provided in Section VIII.
The extension to ergodic fading channels with only intended 
receiver's channel state information is treated in Section IX 
and Section X contains some concluding remarks. 

II. Notation 

Bold upper and lower case characters are used for matrices 
and vectors, respectively. Random variables are distinguished 
from realizations by the use of sans-serif fonts for the former
and seriffed fonts for the latter. We generally reserve the
symbols $I$ for mutual information, $H$ for entropy, and $h$ for
differential entropy. All logarithms are base-2 unless otherwise 
indicated. 

The set of all $n$-dimensional complex-valued vectors is
denoted by $\mathbb{C}^n$, and the set of $m \times n$-dimensional matrices
is denoted using $\mathbb{C}^{m \times n}$. Matrix transposition is denoted using
the superscript $T$, and the Hermitian (i.e., conjugate) transpose
of a matrix is denoted using the superscript $\dagger$. Moreover,
$\mathrm{Null}(\cdot)$ denotes the null space of its matrix argument, and
$\mathrm{tr}(\cdot)$ and $\det(\cdot)$ denote the trace and determinant of a matrix,
respectively. The notation $\mathbf{A} \succeq 0$ means that $\mathbf{A}$ is a positive
semidefinite matrix and we reserve the symbol $\mathbf{I}$ to denote
the identity matrix, whose dimensions will be clear from the
context.

A sequence of length $n$ is either denoted by $\{x(t)\}_{t=1}^{n}$ or
sometimes more succinctly as $x^n$; in addition, we sometimes
need the notation $x_i^j$ for a sequence $x_i, x_{i+1}, \ldots, x_j$.

Finally, $\mathcal{CN}(\mathbf{0}, \mathbf{K})$ denotes a zero-mean circularly-
symmetric complex Gaussian distribution with covariance $\mathbf{K}$,
and we use the notation $\{\cdot\}^+ = \max(0, \cdot)$ throughout the
paper.



A. Preliminaries: Generalized Eigenvalues 

Many of our results arise out of generalized eigenvalue anal- 
ysis. We summarize the properties of generalized eigenvalues 
and eigenvectors we require in the sequel. For more extensive 
developments of the topic, see, e.g., [13], [14]. 

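Definition 1 (Generalized eigenvalues): For a Hermitian
matrix $\mathbf{A} \in \mathbb{C}^{n \times n}$ and positive definite$^1$ matrix $\mathbf{B} \in \mathbb{C}^{n \times n}$,
we refer to $(\lambda, \boldsymbol{\psi})$ as a generalized eigenvalue-eigenvector pair
of $(\mathbf{A}, \mathbf{B})$ if $(\lambda, \boldsymbol{\psi})$ satisfy

$$\mathbf{A}\boldsymbol{\psi} = \lambda \mathbf{B}\boldsymbol{\psi}. \tag{1}$$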



Since B in Definition 1 is invertible, first note that gener- 
alized eigenvalues and eigenvectors can be readily expressed 
in terms of regular ones. Specifically, 

Fact 1: The generalized eigenvalues and eigenvectors of the
pair $(\mathbf{A}, \mathbf{B})$ are the regular eigenvalues and eigenvectors of the
matrix $\mathbf{B}^{-1}\mathbf{A}$.

Other characterizations reveal more useful properties for our 
development. For example, we have the following: 

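Fact 2 (Variational Characterization): The generalized
eigenvectors of $(\mathbf{A}, \mathbf{B})$ are the stationary points
of a particular Rayleigh quotient. Specifically, the largest
generalized eigenvalue is the maximum of the Rayleigh
quotient$^2$

$$\lambda_{\max}(\mathbf{A}, \mathbf{B}) = \max_{\boldsymbol{\psi} \in \mathbb{C}^n} \frac{\boldsymbol{\psi}^\dagger \mathbf{A} \boldsymbol{\psi}}{\boldsymbol{\psi}^\dagger \mathbf{B} \boldsymbol{\psi}}, \tag{2}$$

and the optimum is attained by the eigenvector corresponding
to $\lambda_{\max}(\mathbf{A}, \mathbf{B})$.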

The case when A has rank one is of special interest to us. 
In this case, the generalized eigenvalue admits a particularly 
simple expression: 

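Fact 3 (Quadratic Form): When $\mathbf{A}$ in Definition 1 has rank
one, i.e., $\mathbf{A} = \mathbf{a}\mathbf{a}^\dagger$ for some $\mathbf{a} \in \mathbb{C}^n$, then

$$\lambda_{\max}(\mathbf{a}\mathbf{a}^\dagger, \mathbf{B}) = \mathbf{a}^\dagger \mathbf{B}^{-1} \mathbf{a}. \tag{3}$$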
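
As a quick numerical sanity check of Facts 1 to 3, the following NumPy sketch builds an arbitrary Hermitian $\mathbf{A}$ and positive definite $\mathbf{B}$ and confirms each property up to floating-point error. The matrices are random test data rather than anything from the paper, and the helper names `random_complex` and `rayleigh` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

def random_complex(*shape):
    """Complex Gaussian test data; the values themselves carry no meaning."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

# Hermitian A and positive definite B (arbitrary test matrices).
G = random_complex(n, n)
A = G + G.conj().T
F = random_complex(n, n)
B = F @ F.conj().T + np.eye(n)

# Fact 1: generalized eigenpairs of (A, B) are regular eigenpairs of B^{-1} A.
lam, Psi = np.linalg.eig(np.linalg.solve(B, A))
k = int(np.argmax(lam.real))
lam_max, psi = lam[k].real, Psi[:, k]
residual = np.linalg.norm(A @ psi - lam_max * (B @ psi))  # ~0 if A psi = lam B psi

# Fact 2: the Rayleigh quotient is maximized at psi, where it equals lam_max.
def rayleigh(v):
    return ((v.conj() @ A @ v) / (v.conj() @ B @ v)).real

best_random = max(rayleigh(random_complex(n)) for _ in range(1000))

# Fact 3: for rank-one A = a a^H, lam_max(a a^H, B) = a^H B^{-1} a.
a = random_complex(n)
lam_rank_one = np.linalg.eigvals(np.linalg.solve(B, np.outer(a, a.conj()))).real.max()
quadratic_form = (a.conj() @ np.linalg.solve(B, a)).real

print(residual, rayleigh(psi) - lam_max, best_random <= lam_max,
      lam_rank_one - quadratic_form)
```

Random trial vectors can only approach the maximum in Fact 2 from below, which is why the check is a one-sided inequality.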



$^1$When $\mathbf{B}$ is singular, we replace $\lambda$ with a pair $(\alpha, \beta)$ that satisfies
$\beta\mathbf{A}\boldsymbol{\psi} = \alpha\mathbf{B}\boldsymbol{\psi}$. A solution for which $\alpha \neq 0$ and $\beta = 0$ corresponds to
an infinite eigenvalue. Generalized eigenvalues and eigenvectors also arise in
simultaneous diagonalization of $(\mathbf{A}, \mathbf{B})$ [13].

$^2$Throughout the paper we use $\lambda_{\max}$ to denote the largest eigenvalue.
Whether this is a regular or generalized eigenvalue will be clear from context,
and when there is a need to be explicit, the relevant matrix or matrices will
be indicated as arguments.
III. Channel and System Model

The MISOME channel and system model is as follows. 
We use $n_t$ and $n_e$ to denote the number of sender and
eavesdropper antennas, respectively; the (intended) receiver
has a single antenna. The signals observed at the receiver and
eavesdropper, respectively, are, for $t = 1, 2, \ldots$,

$$\begin{aligned} y_r(t) &= \mathbf{h}_r^\dagger \mathbf{x}(t) + z_r(t), \\ \mathbf{y}_e(t) &= \mathbf{H}_e \mathbf{x}(t) + \mathbf{z}_e(t), \end{aligned} \tag{4}$$

where $\mathbf{x}(t) \in \mathbb{C}^{n_t}$ is the transmitted signal vector, $\mathbf{h}_r \in \mathbb{C}^{n_t}$
and $\mathbf{H}_e \in \mathbb{C}^{n_e \times n_t}$ are complex channel gains, and $z_r(t)$ and
$\mathbf{z}_e(t)$ are independent identically-distributed (i.i.d.) circularly-
symmetric complex-valued Gaussian noises: $z_r(t) \sim \mathcal{CN}(0, 1)$
and $\mathbf{z}_e(t) \sim \mathcal{CN}(\mathbf{0}, \mathbf{I})$. Moreover, the noises are independent,
and the input satisfies an average power constraint of $P$, i.e.,

$$E\left[\frac{1}{n}\sum_{t=1}^{n} \|\mathbf{x}(t)\|^2\right] \le P. \tag{5}$$



Finally, except when otherwise indicated, all channel gains are 
fixed throughout the entire transmission period, and are known 
to all the terminals. 
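
To make the model concrete, here is a minimal simulation sketch of (4) with an input that meets the power constraint (5) in expectation. The dimensions, seed, power level, and the choice of an isotropic Gaussian input are assumptions made purely for illustration; the helper `cn` is a hypothetical name.

```python
import numpy as np

rng = np.random.default_rng(1)
n_t, n_e, n, P = 3, 2, 10_000, 2.0   # illustrative sizes, not from the paper

def cn(*shape, var=1.0):
    """i.i.d. circularly symmetric complex Gaussian samples, CN(0, var)."""
    return np.sqrt(var / 2) * (rng.standard_normal(shape)
                               + 1j * rng.standard_normal(shape))

h_r = cn(n_t)          # receiver channel, fixed for the whole block
H_e = cn(n_e, n_t)     # eavesdropper channel, fixed for the whole block

# Isotropic Gaussian input with covariance (P / n_t) I, so E||x(t)||^2 = P.
X = cn(n_t, n, var=P / n_t)

y_r = h_r.conj() @ X + cn(n)        # y_r(t) = h_r^H x(t) + z_r(t)
Y_e = H_e @ X + cn(n_e, n)          # y_e(t) = H_e x(t) + z_e(t)

avg_power = np.mean(np.sum(np.abs(X) ** 2, axis=0))   # (1/n) sum_t ||x(t)||^2
print(y_r.shape, Y_e.shape, avg_power)
```

The empirical average power fluctuates around $P$ from block to block; (5) constrains only its expectation.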

Communication takes place at a rate R in bits per channel 
use over a transmission interval of length n. Specifically, a 
$(2^{nR}, n)$ code for the channel consists of a message $w$ uniformly
distributed over the index set $\mathcal{W}_n = \{1, 2, \ldots, 2^{nR}\}$,
an encoder $\mu_n : \mathcal{W}_n \to \mathbb{C}^{n_t \times n}$ that maps the message $w$ to
the transmitted (vector) sequence $\{\mathbf{x}(t)\}_{t=1}^{n}$, and a decoding
function $\nu_n : \mathbb{C}^n \to \mathcal{W}_n$ that maps the received sequence
$\{y_r(t)\}_{t=1}^{n}$ to a message estimate $\hat{w}$. The error event is $\mathcal{E}_n =
\{\nu_n(\mu_n(w)) \neq w\}$, and the amount of information obtained
by the eavesdropper from the transmission is measured via the
equivocation $I(w; \mathbf{y}_e^n)$.

Definition 2 (Secrecy Capacity): A secrecy rate $R$ is
achievable if there exists a sequence of $(2^{nR}, n)$ codes such
that $\Pr(\mathcal{E}_n) \to 0$ and $I(w; \mathbf{y}_e^n)/n \to 0$ as $n \to \infty$. The secrecy
capacity is the supremum of all achievable secrecy rates.

Note that our notion of secrecy capacity follows [1]-[3] in
requiring a vanishing per-symbol mutual information for the 
eavesdropper's channel (hence the normalization by n in Def- 
inition 2). Practically, this means that while the eavesdropper 
is unable to decode any fixed fraction of the message bits, it 
does not preclude the possibility of decoding a fixed number 
(but vanishing fraction) of the message bits. 

Maurer and Wolf [15] (see also [16]) have observed that 
for discrete memoryless channels, the secrecy capacity is not 
reduced even when one imposes the stronger requirement that 
$I(w; \mathbf{y}_e^n) \to 0$ as $n \to \infty$. However, we remark in advance
that it remains an open question whether a similar result holds 
for the Gaussian case of interest in this work. 

IV. Main Results 

The MISOME wiretap channel is a nondegraded broadcast 
channel. In Csiszár and Körner [2], the secrecy capacity of the
nondegraded discrete memoryless broadcast channel $p_{y_r, y_e | x}$ is
expressed in the form

$$C = \max_{p_u,\, p_{x|u}} I(u; y_r) - I(u; y_e), \tag{6}$$

where $u$ is an auxiliary random variable over a certain alphabet
that satisfies the Markov relation $u \leftrightarrow x \leftrightarrow (y_r, y_e)$. Moreover,
the secrecy capacity (6) readily extends to the continuous
alphabet case with a power constraint, so it also gives a
characterization of the MISOME channel capacity.

Rather than attempting to solve for the optimal choice of u 
and $p_{x|u}$ in (6) directly to evaluate this capacity,$^3$ we consider
an indirect approach based on a useful upper bound as the 
converse, which we describe next. We note in advance that, as 
described in [10], our upper bound has the added benefit that 
it extends easily to the MIMOME case (i.e., when the receiver 
has multiple antennas). 

A. Upper Bound on Achievable Rates 

A key result is the following upper bound, which we derive 
in Section V. 

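Theorem 1: An upper bound on the secrecy capacity for the
MISOME channel model is

$$R_+ = \min_{\mathbf{K}_\Phi \in \mathcal{K}_\Phi} \; \max_{\mathbf{K}_P \in \mathcal{K}_P} R_+(\mathbf{K}_P, \mathbf{K}_\Phi), \tag{7}$$

where $R_+(\mathbf{K}_P, \mathbf{K}_\Phi) = I(\mathbf{x}; y_r \mid \mathbf{y}_e)$ with $\mathbf{x} \sim \mathcal{CN}(\mathbf{0}, \mathbf{K}_P)$ and

$$\mathcal{K}_P = \left\{ \mathbf{K}_P \;\middle|\; \mathbf{K}_P \succeq 0,\ \mathrm{tr}(\mathbf{K}_P) \le P \right\}, \tag{8}$$

and where

$$\begin{bmatrix} z_r \\ \mathbf{z}_e \end{bmatrix} \sim \mathcal{CN}(\mathbf{0}, \mathbf{K}_\Phi), \tag{9}$$

with

$$\mathcal{K}_\Phi = \left\{ \mathbf{K}_\Phi \;\middle|\; \mathbf{K}_\Phi = \begin{bmatrix} 1 & \boldsymbol{\phi}^\dagger \\ \boldsymbol{\phi} & \mathbf{I} \end{bmatrix},\ \mathbf{K}_\Phi \succeq 0 \right\}. \tag{10}$$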



To obtain this bound, we consider a genie-aided channel in 
which the eavesdropper observes $\mathbf{y}_e$ but the receiver observes
both $y_r$ and $\mathbf{y}_e$. Such a channel clearly has a capacity at least as large
as that of the original channel. Moreover, since it is a degraded
broadcast channel, the secrecy capacity of the genie-aided
channel can be easily derived and is given by (cf. [1])
$\max I(\mathbf{x}; y_r \mid \mathbf{y}_e)$, where the maximum is over the choice of
input distributions. As we will see, it is straightforward to
establish that the maximizing input distribution is Gaussian
(in contrast to the original channel).

Next, while the secrecy capacity of the original channel
depends only on the marginal distributions $p_{y_r|x}$ and $p_{y_e|x}$ (see,
e.g., [2]), the mutual information $I(\mathbf{x}; y_r \mid \mathbf{y}_e)$ for the genie-aided
channel depends on the joint distribution $p_{y_r, y_e|x}$. Accordingly,
we obtain the tightest such upper bound by minimizing over the
joint distributions having the required marginal distributions,
whence (7).

The optimization (7) can be carried out analytically, yielding 
an explicit expression, as we now develop. 
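
The inequality behind the genie-aided argument can be checked numerically. The sketch below, written under stated assumptions, fixes one admissible input covariance $\mathbf{K}_P$ and verifies that $I(\mathbf{x}; y_r \mid \mathbf{y}_e)$ is never smaller than $I(\mathbf{x}; y_r) - I(\mathbf{x}; y_e)$, whatever the noise cross-covariance. The channels are random test data, the cross-covariance is parameterized by a vector `phi` with norm below one (an assumption of this sketch), and only a fixed $\mathbf{K}_P$ is examined, not the full min-max in (7).

```python
import numpy as np

rng = np.random.default_rng(2)
n_t, n_e, P = 3, 2, 5.0   # illustrative sizes and power, not from the paper

def cn(*shape):
    """i.i.d. CN(0, 1) samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def logdet2(M):
    """log2 det(M) for a Hermitian positive definite matrix M."""
    return np.linalg.slogdet(M)[1] / np.log(2)

h_r, H_e = cn(n_t), cn(n_e, n_t)
H_t = np.vstack([h_r.conj()[None, :], H_e])   # rows: h_r^H, then H_e

def rate_difference(K_P):
    """I(x; y_r) - I(x; y_e) for x ~ CN(0, K_P) on the original channel."""
    snr_r = (h_r.conj() @ K_P @ h_r).real
    return np.log2(1 + snr_r) - logdet2(np.eye(n_e) + H_e @ K_P @ H_e.conj().T)

def genie_rate(K_P, phi):
    """I(x; y_r | y_e) when E[z_e z_r^*] = phi, with ||phi|| < 1."""
    K_Phi = np.block([[np.ones((1, 1)), phi.conj()[None, :]],
                      [phi[:, None], np.eye(n_e)]])
    joint = logdet2(K_Phi + H_t @ K_P @ H_t.conj().T) - logdet2(K_Phi)  # I(x; y_r, y_e)
    eve = logdet2(np.eye(n_e) + H_e @ K_P @ H_e.conj().T)               # I(x; y_e)
    return joint - eve

def random_phi():
    v = cn(n_e)
    return rng.uniform(0.0, 0.95) * v / np.linalg.norm(v)

# Beamform toward the receiver with full power (one admissible K_P, not the optimum).
K_P = P * np.outer(h_r, h_r.conj()) / np.linalg.norm(h_r) ** 2
genie_rates = [genie_rate(K_P, random_phi()) for _ in range(200)]
print(rate_difference(K_P), min(genie_rates), genie_rate(K_P, np.zeros(n_e)))
```

The spread of `genie_rates` across different `phi` illustrates why the bound is tightened by minimizing over the noise correlation.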

$^3$The direct approach is explored in, e.g., [11] and [12], where the difficulty
of performing this optimization is reported even when restricting $p_{x|u}$ to be
singular (a deterministic mapping) and/or the input distribution to be Gaussian.
B. MISOME Secrecy Capacity

The upper bound described in the preceding section is 
achievable, yielding the MISOME channel capacity. Specif- 
ically, we have the following theorem, which we prove in 
Section VI-A. 

Theorem 2: The secrecy capacity of the channel (4) is

$$C(P) = \left\{ \log \lambda_{\max}\!\left( \mathbf{I} + P\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{I} + P\mathbf{H}_e^\dagger\mathbf{H}_e \right) \right\}^+, \tag{11}$$

with $\lambda_{\max}$ denoting the largest generalized eigenvalue of its
argument pair. Furthermore, the capacity is obtained by beamforming
(i.e., signaling with rank-one covariance) along the
direction $\boldsymbol{\psi}_{\max}$ of the$^4$ generalized eigenvector corresponding
to $\lambda_{\max}$ with an encoding of the message using a code for the
scalar Gaussian wiretap channel.

We emphasize that the beamforming direction in Theorem 2 
for achieving capacity will in general depend on all of the 
target receiver's channel $\mathbf{h}_r$, the eavesdropper's channel $\mathbf{H}_e$,
and the SNR ($P$).
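
The capacity expression (11) and the beamforming claim of Theorem 2 can be sketched numerically as follows. The channel draws, seed, and power are illustrative assumptions, and the function names are hypothetical. The generalized eigenproblem is reduced to a Hermitian one through a Cholesky factor of the second matrix, which is one standard way to solve it and not necessarily how the authors computed their results.

```python
import numpy as np

def gen_eig_max(A, B):
    """Largest generalized eigenvalue of (A, B) and a unit-norm eigenvector.

    Assumes A is Hermitian and B is Hermitian positive definite.
    """
    L = np.linalg.cholesky(B)                  # B = L L^H
    L_inv = np.linalg.inv(L)
    M = L_inv @ A @ L_inv.conj().T             # Hermitian, shares eigenvalues with B^{-1} A
    w, V = np.linalg.eigh(M)                   # eigenvalues in ascending order
    psi = L_inv.conj().T @ V[:, -1]            # map back: psi = L^{-H} v
    return w[-1], psi / np.linalg.norm(psi)

def secrecy_capacity(h_r, H_e, P):
    """Evaluate (11) in bits per channel use; returns (C, beamforming direction)."""
    n_t = h_r.size
    A = np.eye(n_t) + P * np.outer(h_r, h_r.conj())
    B = np.eye(n_t) + P * (H_e.conj().T @ H_e)
    lam_max, psi = gen_eig_max(A, B)
    return max(0.0, float(np.log2(lam_max))), psi

def beamforming_rate(h_r, H_e, P, v):
    """Scalar Gaussian wiretap rate when all power P is sent along the unit vector v."""
    gain_r = abs(h_r.conj() @ v) ** 2
    gain_e = np.linalg.norm(H_e @ v) ** 2
    return max(0.0, float(np.log2(1 + P * gain_r) - np.log2(1 + P * gain_e)))

rng = np.random.default_rng(3)
n_t, n_e, P = 3, 2, 10.0   # illustrative values
h_r = rng.standard_normal(n_t) + 1j * rng.standard_normal(n_t)
H_e = rng.standard_normal((n_e, n_t)) + 1j * rng.standard_normal((n_e, n_t))

C, psi = secrecy_capacity(h_r, H_e, P)
rate_opt = beamforming_rate(h_r, H_e, P, psi)
rate_naive = beamforming_rate(h_r, H_e, P, h_r / np.linalg.norm(h_r))
print(C, rate_opt, rate_naive)
```

Beamforming along the generalized eigenvector reproduces the capacity value, while beamforming straight at the receiver (ignoring the eavesdropper) can do no better.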

In the high SNR regime, the MISOME capacity (11) exhibits
one of two possible behaviors, corresponding to whether

$$\lim_{P \to \infty} C(P) = \left\{ \log \lambda_{\max}\!\left( \mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{H}_e^\dagger\mathbf{H}_e \right) \right\}^+ \tag{12}$$

is finite or infinite, which depends on whether or not $\mathbf{h}_r$ has a
component in the null space of $\mathbf{H}_e$. Specifically, we have the
following corollary, which we prove in Section VI-B.

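Corollary 1: The high SNR asymptote of the secrecy capacity
(11) takes the form

$$\lim_{P \to \infty} C(P) = \left\{ \log \lambda_{\max}\!\left( \mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{H}_e^\dagger\mathbf{H}_e \right) \right\}^+ < \infty \quad \text{if } \mathbf{H}_e^\perp \mathbf{h}_r = \mathbf{0}, \tag{13a}$$

$$\lim_{P \to \infty} \left[ C(P) - \log P \right] = \log \left\| \mathbf{H}_e^\perp \mathbf{h}_r \right\|^2 \quad \text{if } \mathbf{H}_e^\perp \mathbf{h}_r \neq \mathbf{0}, \tag{13b}$$

where $\mathbf{H}_e^\perp$ denotes the projection matrix onto the null space
of $\mathbf{H}_e$.$^5$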

This behavior can be understood rather intuitively. In particular,
when $\mathbf{H}_e^\perp\mathbf{h}_r = \mathbf{0}$, as is typically the case when the
eavesdropper uses enough antennas ($n_e \ge n_t$) or the intended
receiver has an otherwise unfortunate channel, the secrecy
capacity is SNR-limited. In essence, while more transmit
power is advantageous to communication to the intended
receiver, it is also advantageous to the eavesdropper, resulting
in diminishing returns.

By contrast, when $\mathbf{H}_e^\perp\mathbf{h}_r \neq \mathbf{0}$, as is typically the case
when, e.g., the eavesdropper uses insufficiently many antennas
($n_e < n_t$) unless the eavesdropper has an otherwise unfortunate
channel, the transmitter is able to steer a null to the
eavesdropper without simultaneously nulling the receiver and
thus capacity grows by 1 b/s/Hz with every 3 dB increase in
transmit power as it would if there were no eavesdropper to
contend with.
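
Both high-SNR regimes of Corollary 1 can be illustrated with a small deterministic example. The channels below are hypothetical and chosen so the limits are easy to read off: with two eavesdropper antennas the third transmit direction is unobserved, while with three the null space is trivial. The null-space projector is built from an SVD, which is an implementation choice of this sketch.

```python
import numpy as np

def secrecy_capacity(h_r, H_e, P):
    """Evaluate (11) in bits per channel use."""
    n_t = h_r.size
    A = np.eye(n_t) + P * np.outer(h_r, h_r.conj())
    B = np.eye(n_t) + P * (H_e.conj().T @ H_e)
    L_inv = np.linalg.inv(np.linalg.cholesky(B))
    lam_max = np.linalg.eigvalsh(L_inv @ A @ L_inv.conj().T)[-1]
    return max(0.0, float(np.log2(lam_max)))

def null_projection(H_e, tol=1e-10):
    """Orthogonal projector onto Null(H_e), built from the SVD."""
    _, s, Vh = np.linalg.svd(H_e)              # full_matrices=True by default
    rank = int(np.sum(s > tol * s.max()))
    N = Vh[rank:].conj().T                     # orthonormal basis of the null space
    return N @ N.conj().T

h_r = np.array([1.0, 1.0j, 0.5])                       # hypothetical receiver channel
H_two = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]], dtype=complex)     # n_e = 2 < n_t = 3
H_three = np.eye(3, dtype=complex)                     # n_e = 3 = n_t
P = 1e6

# Regime (13b): h_r has a component in Null(H_e), so C(P) - log P converges.
leak_free = null_projection(H_two) @ h_r
offset = secrecy_capacity(h_r, H_two, P) - np.log2(P)
predicted_offset = np.log2(np.linalg.norm(leak_free) ** 2)

# Regime (13a): trivial null space, so C(P) saturates; Fact 3 gives the limit.
capped = secrecy_capacity(h_r, H_three, P)
G = H_three.conj().T @ H_three
predicted_cap = max(0.0, float(np.log2((h_r.conj() @ np.linalg.solve(G, h_r)).real)))

print(offset, predicted_offset, capped, predicted_cap)
```

Adding the third eavesdropper antenna is what turns unbounded growth into a fixed ceiling in this example.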

The MISOME capacity (11) is also readily specialized to 
the low SNR regime, as we develop in Section VI-C, and takes 
the following form. 

$^4$If there is more than one generalized eigenvector for $\lambda_{\max}$, we choose
any one of them.

$^5$That is, the columns of $\mathbf{H}_e^\perp$ constitute an orthogonal basis for the null
space of $\mathbf{H}_e$.



Corollary 2: The low SNR asymptote of the secrecy capacity
is

$$\lim_{P \to 0} \frac{C(P)}{P} = \frac{1}{\ln 2} \left\{ \lambda_{\max}\!\left( \mathbf{h}_r\mathbf{h}_r^\dagger - \mathbf{H}_e^\dagger\mathbf{H}_e \right) \right\}^+. \tag{14}$$

In this low SNR regime, the direction of the optimal beamforming
vector approaches the (regular) eigenvector corresponding
to the largest (regular) eigenvalue of $\mathbf{h}_r\mathbf{h}_r^\dagger - \mathbf{H}_e^\dagger\mathbf{H}_e$. Note
that the optimal direction is in general not along $\mathbf{h}_r$.$^6$ Thus,
ignoring the eavesdropper is in general not an optimal strategy
even at low SNR.
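
A small numerical sketch of Corollary 2, using a hypothetical two-antenna channel in which the eavesdropper's single antenna sees only the first transmit antenna. It compares the slope $C(P)/P$ at a small power with the right-hand side of (14) and measures how far the low-SNR beamforming direction is from $\mathbf{h}_r$. The channel values and the test power are assumptions chosen for readability.

```python
import numpy as np

def secrecy_capacity(h_r, H_e, P):
    """Evaluate (11) in bits per channel use."""
    n_t = h_r.size
    A = np.eye(n_t) + P * np.outer(h_r, h_r.conj())
    B = np.eye(n_t) + P * (H_e.conj().T @ H_e)
    L_inv = np.linalg.inv(np.linalg.cholesky(B))
    lam_max = np.linalg.eigvalsh(L_inv @ A @ L_inv.conj().T)[-1]
    return max(0.0, float(np.log2(lam_max)))

h_r = np.array([1.0, 1.0j])                     # hypothetical receiver channel
H_e = np.array([[1.0, 0.0]], dtype=complex)     # eavesdropper sees transmit antenna 1 only

# Right-hand side of (14): largest regular eigenvalue of h h^H - H^H H, over ln 2.
D = np.outer(h_r, h_r.conj()) - H_e.conj().T @ H_e
w, V = np.linalg.eigh(D)
slope_predicted = max(0.0, w[-1]) / np.log(2)

# Left-hand side of (14), approximated at a small but nonzero power.
P = 1e-4
slope_numeric = secrecy_capacity(h_r, H_e, P) / P

# Low-SNR beamforming direction versus the receiver's own channel direction.
v_low = V[:, -1]
overlap = abs(h_r.conj() @ v_low) / np.linalg.norm(h_r)   # equals 1 only if aligned

print(slope_predicted, slope_numeric, overlap)
```

An `overlap` strictly below one shows the optimal direction tilting away from the antenna the eavesdropper observes, consistent with the remark that ignoring the eavesdropper is suboptimal even at low SNR.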



C. Eavesdropper-Ignorant Coding: Masked Beamforming 

In our basic model the channel gains are fixed and known to
all the terminals. Our capacity-achieving scheme in Theorem 2
uses the knowledge of $\mathbf{H}_e$ for selecting the beamforming
direction. However, in many applications it may be difficult
to know the eavesdropper's channel. Accordingly, in this
section we analyze a simple alternative scheme that uses
only knowledge of $\mathbf{h}_r$ in choosing the transmit directions, yet
achieves near-optimal performance in the high SNR regime.

The scheme we analyze is a masked beamforming scheme 
described in [4], [5]. In this scheme, the transmitter signals 
isotropically (i.e., with a covariance that is a scaled identity 
matrix), and as such can be naturally viewed as a "secure 
space-time code." More specifically, it simultaneously transmits
the message (encoded using a scalar Gaussian wiretap
code) in the direction corresponding to the intended receiver's
channel $\mathbf{h}_r$ while transmitting synthesized spatio-temporal
white noise in the orthogonal subspace (i.e., all other directions).

The performance of masked beamforming is given by the 
following proposition, which is proved in Section VII-A. 

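Proposition 1 (Masked Beamforming Secrecy Rate): A
rate achievable by the masked beamforming scheme for the
MISOME channel is

$$R_{\mathrm{MB}}(P) = \log \lambda_{\max}\!\left( \frac{P}{n_t}\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{I} + \frac{P}{n_t}\mathbf{H}_e^\dagger\mathbf{H}_e \right) + \log\!\left( 1 + \frac{n_t}{P\|\mathbf{h}_r\|^2} \right). \tag{15}$$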



While the rate (15) is, in general, suboptimal, it is asymptotically
near-optimal in the following sense, as developed in
Section VII-B.

Theorem 3: The rate $R_{\mathrm{MB}}(P)$ achievable by the masked beamforming
scheme for the MISOME case [cf. (15)] satisfies

$$\lim_{P \to \infty} \left[ C\!\left( \frac{P}{n_t} \right) - R_{\mathrm{MB}}(P) \right] = 0. \tag{16}$$



From the relation in (16) we note that, in the high SNR
regime, the masked beamforming scheme achieves a rate
of $C(P/n_t)$, where $n_t$ is the number of transmit antennas.
Combining (16) with (13), we see that the asymptotic masked
beamforming loss is at most $\log n_t$ b/s/Hz, or equivalently
$10 \log_{10} n_t$ dB in SNR. Specifically,

$$\lim_{P \to \infty} \left[ C(P) - R_{\mathrm{MB}}(P) \right] = \begin{cases} \log n_t, & \mathbf{H}_e^\perp\mathbf{h}_r \neq \mathbf{0}, \\ 0, & \mathbf{H}_e^\perp\mathbf{h}_r = \mathbf{0}. \end{cases} \tag{17}$$

$^6$The optimal direction is $\mathbf{h}_r$ in some special cases, such as if $\mathbf{h}_r$ happens
to be an eigenvector of $\mathbf{H}_e^\dagger\mathbf{H}_e$. The latter happens when, e.g., the $n_t$ columns
of $\mathbf{H}_e$ are orthogonal and have the same norm.



That at least some loss (if vanishing) is associated with the 
masked beamforming scheme is expected, since the capacity- 
achieving scheme performs beamforming to concentrate the 
transmission along the optimal direction, whereas the masked 
beamforming scheme uses isotropic inputs. 

As one final comment, note that although the covariance 
structure of the masked beamforming transmission does not 
depend on the eavesdropper's channel, the rate of the base 
(scalar Gaussian wiretap) code does, as (15) reflects. In 
practice, the selection of this rate determines an insecurity 
zone around the sender, whereby the transmission is secure 
from eavesdroppers outside this zone, but insecure from ones 
inside. 
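
The masked beamforming rate can be cross-checked in two ways: from the closed form of Proposition 1, and directly as a difference of mutual informations for a transmission that sends the message along $\mathbf{h}_r$ and white noise in the orthogonal directions, each with power $P/n_t$. The sketch below does both on a hypothetical deterministic channel and then evaluates the high-SNR gaps in (16) and (17). The closed form used here is the expression $\log \lambda_{\max}\big(\tfrac{P}{n_t}\mathbf{h}_r\mathbf{h}_r^\dagger, \mathbf{I} + \tfrac{P}{n_t}\mathbf{H}_e^\dagger\mathbf{H}_e\big) + \log\big(1 + \tfrac{n_t}{P\|\mathbf{h}_r\|^2}\big)$, evaluated through Fact 3; channel values and power levels are illustrative assumptions.

```python
import numpy as np

def logdet2(M):
    """log2 det(M) for a Hermitian positive definite matrix M."""
    return np.linalg.slogdet(M)[1] / np.log(2)

def secrecy_capacity(h_r, H_e, P):
    """Evaluate (11) in bits per channel use."""
    n_t = h_r.size
    A = np.eye(n_t) + P * np.outer(h_r, h_r.conj())
    B = np.eye(n_t) + P * (H_e.conj().T @ H_e)
    L_inv = np.linalg.inv(np.linalg.cholesky(B))
    return max(0.0, float(np.log2(np.linalg.eigvalsh(L_inv @ A @ L_inv.conj().T)[-1])))

def masked_beamforming_rate(h_r, H_e, P):
    """Closed form (15); the rank-one eigenvalue is computed via Fact 3."""
    c = P / h_r.size
    B = np.eye(h_r.size) + c * (H_e.conj().T @ H_e)
    lam = c * (h_r.conj() @ np.linalg.solve(B, h_r)).real
    return float(np.log2(lam) + np.log2(1 + 1 / (c * np.linalg.norm(h_r) ** 2)))

def masked_beamforming_rate_direct(h_r, H_e, P):
    """I(u; y_r) - I(u; y_e): message along h_r, white noise in the orthogonal subspace."""
    n_t, n_e = h_r.size, H_e.shape[0]
    c = P / n_t
    h_dir = h_r / np.linalg.norm(h_r)
    K_noise = c * (np.eye(n_t) - np.outer(h_dir, h_dir.conj()))
    rate_r = np.log2(1 + c * np.linalg.norm(h_r) ** 2)   # the noise is invisible to the receiver
    leak = (logdet2(np.eye(n_e) + c * (H_e @ H_e.conj().T))
            - logdet2(np.eye(n_e) + H_e @ K_noise @ H_e.conj().T))
    return float(rate_r - leak)

h_r = np.array([1.0, 1.0j, 0.5])                      # hypothetical channels
H_e = np.array([[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0]], dtype=complex)
n_t = h_r.size

r_formula = masked_beamforming_rate(h_r, H_e, 10.0)
r_direct = masked_beamforming_rate_direct(h_r, H_e, 10.0)

P_high = 1e6
gap_16 = secrecy_capacity(h_r, H_e, P_high / n_t) - masked_beamforming_rate(h_r, H_e, P_high)
loss_17 = secrecy_capacity(h_r, H_e, P_high) - masked_beamforming_rate(h_r, H_e, P_high)
print(r_formula, r_direct, gap_16, loss_17, np.log2(n_t))
```

In this example $\mathbf{h}_r$ has a component outside the eavesdropper's view, so the high-SNR loss approaches $\log n_t$ rather than zero.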

D. Example 

In this section, we illustrate the preceding results for a
typical MISOME channel. In our example, there are $n_t = 2$
transmit antennas, and $n_e = 2$ eavesdropper antennas. The
channel to the receiver is

$$\mathbf{h}_r = \begin{bmatrix} 0.0991 + j0.8676 & 1.0814 - j1.1281 \end{bmatrix}^T,$$

while the channel to the eavesdropper is

$$\mathbf{H}_e = \begin{bmatrix} 0.3880 + j1.2024 & -0.9825 + j0.5914 \\ 0.4709 - j0.3073 & 0.6815 - j0.2125 \end{bmatrix}, \tag{18}$$

where $j = \sqrt{-1}$.

Fig. 1 depicts communication rate as a function of SNR.
The upper and lower solid curves depict the secrecy capacity
(11) when the eavesdropper is using one or both of its antennas,
respectively.$^7$ As the curves reflect, when the eavesdropper has
only a single antenna, the transmitter can securely communicate
at any desired rate to its intended receiver by using enough
power. However, by using both its antennas, the eavesdropper
caps the rate at which the transmitter can communicate securely
regardless of how much power it has available. Note
that the lower and upper curves are representative of the cases
where $\mathbf{H}_e^\perp\mathbf{h}_r$ is, and is not, $\mathbf{0}$, respectively.

Fig. 1 also shows other curves of interest. In particular, using 
dotted curves we superimpose the secrecy capacity high-SNR 
asymptotes as given by (13). As is apparent, these asymptotes 
can be quite accurate approximations even for moderate values 
of SNR. Finally, using dashed curves we show the rate (15) 
achievable by the masked beamforming coding scheme, which 
doesn't use knowledge of the eavesdropper channel. Consistent 
with (17), the loss in performance at high SNR approaches 3
dB when the eavesdropper uses only one of its antennas, and
0 dB when it uses both. Again, these are good estimates of the
performance loss even at moderate SNR. Thus the penalty for 
ignorance of the eavesdropper's channel can be quite small in 
practice. 

$^7$When a single eavesdropper antenna is in use, the relevant channel
corresponds to the first row of (18).

Fig. 1. Performance over an example MISOME channel with $n_t = 2$ transmit
antennas. The successively lower solid curves give the secrecy capacity for
$n_e = 1$ and $n_e = 2$ eavesdropper antennas, respectively, and the dotted
curves indicate the corresponding high-SNR asymptotes. The dashed curves
give the corresponding rates achievable by masked beamforming, which does
not require the transmitter to have knowledge of the eavesdropper's channel.



E. Scaling Laws in the Large System Limit 

Our analysis in Section IV-B of the scaling behavior of 
capacity with SNR in the high SNR limit with a fixed number 
of antennas in the system yielded several useful insights into 
secure space-time coding systems. In this section, we develop 
equally valuable insights from a complementary scaling. In 
particular, we consider the scaling behavior of capacity with 
the number of antennas in the large system limit at a fixed 
SNR. 

One convenient feature of such analysis is that for many 
large ensembles of channel gains, almost all randomly drawn 
realizations produce the same capacity asymptotes. For our 
analysis, we restrict our attention to an ensemble corresponding
to Rayleigh fading in which $\mathbf{h}_r$ and $\mathbf{H}_e$ are independent,
and each has i.i.d. $\mathcal{CN}(0, 1)$ entries. The realization from the
ensemble is known to all terminals prior to communication. 

In anticipation of our analysis, we make the dependency 
of secrecy rates on the number of transmit and eavesdropper 
antennas explicit in our notation (but leave the dependency 
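on the realization of $\mathbf{h}_r$ and $\mathbf{H}_e$ implicit). Specifically, we
now use $C(P, n_t, n_e)$ to denote the secrecy capacity, and
$R_{\mathrm{MB}}(P, n_t, n_e)$ to denote the rate of the masked beamforming
scheme. With this notation, the scaled rates of interest are

$$\tilde{C}(\gamma, \beta) = \lim_{n_t \to \infty} C\!\left( P = \gamma / n_t,\ n_t,\ n_e = \beta n_t \right), \tag{19a}$$

and

$$\tilde{R}_{\mathrm{MB}}(\gamma, \beta) = \lim_{n_t \to \infty} R_{\mathrm{MB}}\!\left( P = \gamma,\ n_t,\ n_e = \beta n_t \right). \tag{19b}$$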



Our choice of scalings ensures that $\tilde{C}(\gamma, \beta)$ and
$\tilde{R}_{\mathrm{MB}}(\gamma, \beta)$ are not degenerate. In particular, note that the
capacity scaling (19a) involves an SNR normalization:
the transmitted power $P$ is reduced as the number
of transmitter antennas $n_t$ grows so as to keep the received
SNR fixed (at the specified value $\gamma$) independent of $n_t$.
However, the scaling (19b) is not SNR normalized in this way.
This is because the masked beamforming already suffers a
nominal factor of $n_t$ SNR loss [cf. (16)] relative to a capacity-
achieving system.

In what follows, we do not attempt an exact evaluation of the 
secrecy rates for our chosen scalings. Rather we find compact 
lower and upper bounds that are tight in the high SNR limit. 

We begin with our lower bound, which is derived in 
Section VIII-B. 

Theorem 4 (Scaling Laws): The asymptotic secrecy capacity
satisfies

$$\tilde{C}(\gamma, \beta) \ge \{\log \xi(\gamma, \beta)\}^+, \tag{20}$$

where 

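$$\xi(\gamma, \beta) = \gamma - \frac{1}{4}\left[ \sqrt{1 + \gamma\left(1 + \sqrt{\beta}\right)^2} - \sqrt{1 + \gamma\left(1 - \sqrt{\beta}\right)^2} \right]^2. \tag{21}$$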




Furthermore, the same bound holds for the corresponding
asymptotic masked beamforming rate, i.e.,

$$\tilde{R}_{\mathrm{MB}}(\gamma, \beta) \ge \{\log \xi(\gamma, \beta)\}^+. \tag{22}$$

Since the secrecy rates increase monotonically with SNR, 
the infinite-SNR rates constitute a useful upper bound. As 
derived in Section VIII-C, this bound is as follows. 
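Theorem 5: The asymptotic secrecy capacity satisfies

$$\tilde{C}(\gamma, \beta) \le \lim_{n_t \to \infty} \lim_{P \to \infty} C(P, n_t, \beta n_t) = \begin{cases} 0, & \beta \ge 2, \\ -\log(\beta - 1), & 1 < \beta < 2, \\ \infty, & \beta \le 1. \end{cases} \tag{23}$$

Furthermore, the right-hand side of (23) is also an upper bound
on $\tilde{R}_{\mathrm{MB}}(\gamma, \beta)$, i.e.,

$$\tilde{R}_{\mathrm{MB}}(\gamma, \beta) \le \lim_{n_t \to \infty} \lim_{P \to \infty} R_{\mathrm{MB}}(P, n_t, \beta n_t) = \tilde{C}(\infty, \beta), \tag{24}$$

where $\tilde{C}(\infty, \beta)$ denotes the right-hand side of (23).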

Note that it is straightforward to verify that the lower bound
(20) is tight at high SNR, i.e., that, for all $\beta$,

$$\{\log \xi(\infty, \beta)\}^+ = \tilde{C}(\infty, \beta). \tag{25}$$

The same argument confirms the corresponding behavior for
masked beamforming.
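
The tightness claim (25) can be explored numerically with a few lines of standard-library Python. The closed form used for $\xi$ below, $\gamma - \tfrac{1}{4}\big[\sqrt{1 + \gamma(1+\sqrt{\beta})^2} - \sqrt{1 + \gamma(1-\sqrt{\beta})^2}\big]^2$, is an assumption of this sketch, as are the function names; the high-SNR values are the piecewise expression in (23).

```python
import math

def xi(gamma, beta):
    """Assumed closed form for the function appearing in the bounds (20) and (22)."""
    a = math.sqrt(1 + gamma * (1 + math.sqrt(beta)) ** 2)
    b = math.sqrt(1 + gamma * (1 - math.sqrt(beta)) ** 2)
    return gamma - 0.25 * (a - b) ** 2

def lower_bound(gamma, beta):
    """{log2 xi(gamma, beta)}^+ in b/s/Hz."""
    return max(0.0, math.log2(xi(gamma, beta)))

def high_snr_capacity(beta):
    """Piecewise right-hand side of (23)."""
    if beta >= 2:
        return 0.0
    if beta > 1:
        return -math.log2(beta - 1)
    return math.inf

# Tightness at high SNR for an antenna ratio between 1 and 2.
tight_gap = abs(lower_bound(1e6, 1.5) - high_snr_capacity(1.5))

# Lower bound at a received SNR of 10 dB (gamma = 10) and beta = 1.5.
rate_10db = lower_bound(10.0, 1.5)

print(tight_gap, rate_10db, lower_bound(1e6, 3.0), lower_bound(1e6, 0.5))
```

With $\beta = 1.5$ the high-SNR ceiling is 1 b/s/Hz, and at 10 dB the bound evaluates to roughly 0.44 b/s/Hz, below 1/2.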

Our lower and upper bounds of Theorem 4 and Theorem 5,
respectively, are depicted in Fig. 2. In particular, we plot rate
as a function of the antenna ratio $\beta$ for various values of the
SNR $\gamma$.

As Fig. 2 reflects, there are essentially three main regions
of behavior, the boundaries between which are increasingly
sharp with increasing SNR. First, for $\beta < 1$ the eavesdropper
has proportionally fewer antennas than the sender, and thus is
effectively thwarted. It is in this regime that the transmitter
can steer a null to the eavesdropper and achieve any desired
rate to the receiver by using enough power.

Second, for $1 < \beta < 2$ the eavesdropper has proportionally
more antennas than the sender, and thus can cap the secure rate



Fig. 2. Secrecy capacity bounds in the large system limit, plotted against
the antenna ratio $\beta$. The solid red
curve is the high SNR secrecy capacity, which is an upper bound on the
secrecy capacity for finite SNR. The progressively lower dashed curves are lower bounds on
the asymptotic secrecy capacity (and masked beamforming secrecy rate). The
channel realizations are fixed but drawn at random according to a Gaussian
distribution.

achievable to the receiver regardless of how much power the
transmitter has available. For instance, when the eavesdropper
has 50% more antennas than the sender ($\beta = 1.5$), the
sender is constrained to a maximum secure rate of no more than 1
b/s/Hz. Moreover, if the sender is sufficiently limited in power
that the received SNR is at most, say, 10 dB, the maximum
rate is less than 1/2 b/s/Hz.

We emphasize that these results imply the eavesdropper is at 
a substantial disadvantage compared to the intended receiver 
when the number of transmitter antennas is chosen to be large.
Indeed, the intended receiver needs only a single antenna to 
decode the message, while the eavesdropper needs a large 
number of antennas to constrain the transmission. 

Finally, for $\beta \ge 2$ the eavesdropper is able to entirely
prevent secure communication (drive the secrecy capacity to
zero) even if the transmitter has unlimited power available.
Useful intuition for this phenomenon is obtained from consideration
of the masked beamforming scheme, in which the
sender transmits the signal of interest in the direction of $\mathbf{h}_r$ and
synthesized noise in the $n_t - 1$ directions orthogonal to $\mathbf{h}_r$.
With such a transmission, the intended receiver experiences
a channel gain of $\|\mathbf{h}_r\|^2 P / n_t$. In the high SNR regime,
the eavesdropper must cancel the synthesized noise, which
requires at least $n_t - 1$ receive antennas. Moreover, after
canceling the noise it must have the "beamforming gain" of
$n_t$ so its channel quality is of the same order as that of
the intended receiver. This requires having at least $n_t$ more
antennas. Thus at least $2n_t - 1$ antennas are required by
the eavesdropper to guarantee successful interception of the
transmission irrespective of the power used, which corresponds
to $\beta \ge 2$ as $n_t \to \infty$.

F. Capacity Bounds in Fading 

Thus far we have focused on the scenarios where the
receiver and eavesdropper channels are fixed for the duration
$n$ of the message transmission. In this section, we briefly
turn our attention to the case of time-varying channels,
specifically, the case of fast fading where there are many
channel fluctuations during the course of transmission. In
particular, we consider a model in which $\mathbf{h}_r(t)$ and
$\mathbf{H}_e(t)$ are temporally and spatially i.i.d. sequences
that are independent of one another and have $\mathcal{CN}(0, 1)$
elements, corresponding to Rayleigh fading.

In our model, $\mathbf{h}_r(t)$ is known (in a causal manner) to
all three terminals, but only the eavesdropper has knowledge
of $\mathbf{H}_e(t)$. Accordingly, the channel model is, for
$t = 1, 2, \ldots,$



$$
y_r(t) = \mathbf{h}_r^\dagger(t)\,\mathbf{x}(t) + z_r(t), \qquad
\mathbf{y}_e(t) = \mathbf{H}_e(t)\,\mathbf{x}(t) + \mathbf{z}_e(t). \tag{26}
$$



The definition of the secrecy rate and capacity is as in
Definition 2, with the exception that the equivocation
$I(w; \mathbf{y}_e^n)$ is replaced with
$I(w; \mathbf{y}_e^n, \mathbf{H}_e^n \mid \mathbf{h}_r^n)$, which
takes into account the channel state information at the
different terminals.

For this model we have the following nontrivial upper and 
lower bounds on the secrecy capacity, which are developed in 
Section IX. The upper bound is developed via the same genie- 
aided channel analysis used in the proof of Theorem 2, but 
with modifications to account for the presence of fading. The 
lower bound is achieved by the adaptive version of masked 
beamforming described in [4]. 

Theorem 6: The secrecy capacity for the MISOME fast 
fading channel (26) is bounded by 

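$$
C_{\mathrm{FF}}(P, n_t, n_e) \ge \max_{\rho(\cdot) \in \mathcal{P}_{\mathrm{FF}}} E\left[R_{\mathrm{FF},-}(\mathbf{h}_r, \mathbf{H}_e, \rho(\cdot))\right], \tag{27a}
$$

$$
C_{\mathrm{FF}}(P, n_t, n_e) \le \max_{\rho(\cdot) \in \mathcal{P}_{\mathrm{FF}}} E\left[R_{\mathrm{FF},+}(\mathbf{h}_r, \mathbf{H}_e, \rho(\cdot))\right], \tag{27b}
$$

where $\mathcal{P}_{\mathrm{FF}}$ is the set of all valid power allocations, i.e.,

$$
\mathcal{P}_{\mathrm{FF}} = \left\{\rho(\cdot) \;\middle|\; \rho(\cdot) \ge 0,\ E[\rho(\mathbf{h}_r)] \le P\right\}, \tag{28}
$$

and

$$
R_{\mathrm{FF},-}(\mathbf{h}_r, \mathbf{H}_e, \rho(\cdot)) =
\log\left(\frac{\rho(\mathbf{h}_r)}{n_t}\,\mathbf{h}_r^\dagger\left(\mathbf{I} + \frac{\rho(\mathbf{h}_r)}{n_t}\mathbf{H}_e^\dagger\mathbf{H}_e\right)^{-1}\mathbf{h}_r\right)
+ \log\left(1 + \frac{n_t}{\rho(\mathbf{h}_r)\|\mathbf{h}_r\|^2}\right), \tag{29a}
$$

$$
R_{\mathrm{FF},+}(\mathbf{h}_r, \mathbf{H}_e, \rho(\cdot)) =
\left\{\log \lambda_{\max}\left(\mathbf{I} + \rho(\mathbf{h}_r)\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{I} + \rho(\mathbf{h}_r)\mathbf{H}_e^\dagger\mathbf{H}_e\right)\right\}^+. \tag{29b}
$$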

In general, our upper and lower bounds do not coincide.
Indeed, even in the case of single antennas at all terminals
($n_t = n_e = 1$), the secrecy capacity for the fading channel is
unknown, except in the case of large coherence period [8].

However, based on our scaling analysis in Section IV-E, 
there is one regime in which the capacity can be calculated: 
in the limit of both high SNR and a large system. Indeed, 
since (22) and (23) hold for almost every channel realization, 
we have the following proposition, whose proof is provided 
in Section IX-C. 



Proposition 2: The secrecy capacity of the fast fading channel
satisfies

$$
\lim_{n_t \to \infty} C_{\mathrm{FF}}(P = \gamma, n_t, n_e = \beta n_t) \ge \{\log \xi(\gamma, \beta)\}^+, \tag{30}
$$

where $\xi(\cdot, \cdot)$ is as defined in (21), and

$$
\lim_{n_t \to \infty} C_{\mathrm{FF}}(P = \gamma, n_t, n_e = \beta n_t) \le \tilde{C}(\infty, \beta) \tag{31}
$$

with $\tilde{C}(\infty, \beta)$ as given in (23).

Finally, via (25) we see that (30) and (31) converge as $\gamma \to \infty$.

This concludes our statement of the main results. The 
following sections are devoted to the proofs of these results 
and some further discussion. 

V. Upper Bound Derivation 

In this section we prove Theorem 1. We begin with the
following lemma, which establishes that the capacity of the
genie-aided channel is an upper bound on that of the channel of
interest. A proof is provided in Appendix I; it closely follows
the general converse of Wyner [1], but differs in that the latter
was for discrete channels and thus did not incorporate a power
constraint.

Lemma 1: An upper bound on the secrecy capacity of the
MISOME wiretap channel is

$$
C \le \max_{p_{\mathbf{x}} \in \mathcal{P}} I(\mathbf{x}; y_r \mid \mathbf{y}_e), \tag{32}
$$

where $\mathcal{P}$ is the set of all probability distributions that satisfy
$E[\|\mathbf{x}\|^2] \le P$.

Among all such bounds, we can choose that corresponding
to the noises $(z_r, \mathbf{z}_e)$ being jointly Gaussian (they are
already constrained to be marginally Gaussian) with a covariance
making the bound as small as possible. Then, provided the
maximizing distribution in (32) is Gaussian, we can express
the final bound in the form (7).

It thus remains only to show that the maximizing distribu- 
tion is Gaussian. 

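Lemma 2: For each $\mathbf{K}_\phi \in \mathcal{K}_\phi$, the distribution
$p_{\mathbf{x}}$ maximizing $I(\mathbf{x}; y_r \mid \mathbf{y}_e)$ is Gaussian.
Proof: Since

$$
I(\mathbf{x}; y_r \mid \mathbf{y}_e) = h(y_r \mid \mathbf{y}_e) - h(z_r \mid \mathbf{z}_e),
$$

and the second term does not depend on $p_{\mathbf{x}}$, it suffices to
establish that $h(y_r \mid \mathbf{y}_e)$ is maximized when $\mathbf{x}$ is Gaussian.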

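To this end, let $\boldsymbol{\alpha}_{\mathrm{LMMSE}}\,\mathbf{y}_e$ denote the linear
minimum mean-square error (MMSE) estimator of $y_r$ from
$\mathbf{y}_e$, and $\lambda_{\mathrm{LMMSE}}$ the corresponding mean-square
estimation error. Recall that

$$
\boldsymbol{\alpha}_{\mathrm{LMMSE}} = (\mathbf{h}_r^\dagger\mathbf{K}_P\mathbf{H}_e^\dagger + \phi^\dagger)(\mathbf{I} + \mathbf{H}_e\mathbf{K}_P\mathbf{H}_e^\dagger)^{-1}, \tag{33}
$$

$$
\lambda_{\mathrm{LMMSE}} = 1 + \mathbf{h}_r^\dagger\mathbf{K}_P\mathbf{h}_r
- (\mathbf{h}_r^\dagger\mathbf{K}_P\mathbf{H}_e^\dagger + \phi^\dagger)(\mathbf{I} + \mathbf{H}_e\mathbf{K}_P\mathbf{H}_e^\dagger)^{-1}(\phi + \mathbf{H}_e\mathbf{K}_P\mathbf{h}_r) \tag{34}
$$

depend on the input and noise distributions only through their
(joint) second-moment characterization, i.e.,

$$
\mathbf{K}_P = \operatorname{cov}\mathbf{x}, \qquad
\mathbf{K}_\phi = \begin{bmatrix} 1 & \phi^\dagger \\ \phi & \mathbf{I} \end{bmatrix}
= \operatorname{cov}\begin{bmatrix} z_r \\ \mathbf{z}_e \end{bmatrix}. \tag{35}
$$

Proceeding, we have

$$
h(y_r \mid \mathbf{y}_e) = h(y_r - \boldsymbol{\alpha}_{\mathrm{LMMSE}}\,\mathbf{y}_e \mid \mathbf{y}_e) \tag{36}
$$

$$
\le h(y_r - \boldsymbol{\alpha}_{\mathrm{LMMSE}}\,\mathbf{y}_e) \tag{37}
$$

$$
\le \log 2\pi e\,\lambda_{\mathrm{LMMSE}}, \tag{38}
$$

where (36) holds because adding a constant doesn't change
entropy, (37) holds because conditioning only reduces
differential entropy, and (38) is the maximum entropy bound on
differential entropy expressed in terms of

$$
\operatorname{var} e = \lambda_{\mathrm{LMMSE}}, \tag{39}
$$

where $e$ is the estimation error

$$
e = y_r - \boldsymbol{\alpha}_{\mathrm{LMMSE}}\,\mathbf{y}_e. \tag{40}
$$

It remains only to verify that the above inequalities are tight
for a Gaussian distribution. To see this, note that (37) holds
with equality when $\mathbf{x}$ is Gaussian (and thus $(y_r, \mathbf{y}_e)$ are
jointly Gaussian) since in this case $e$ is the (unconstrained) MMSE
estimation error and is therefore independent of the "data" $\mathbf{y}_e$.
Furthermore, note that in this case (38) holds with equality
since the Gaussian distribution maximizes differential entropy
subject to a variance constraint.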



VI. MISOME Secrecy Capacity Derivation 

In this section we derive the MISOME capacity and its high 
and low SNR asymptotes. 

A. Proof of Theorem 2 

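Achievability of (11) follows from evaluating (6) with the
particular choices

$$
u \sim \mathcal{CN}(0, P), \qquad \mathbf{x} = \psi_{\max}\,u, \tag{41}
$$

where $\psi_{\max}$ is as defined in Theorem 2. With this choice of
parameters,

$$
\begin{aligned}
I(u; y_r) - I(u; \mathbf{y}_e)
&= I(\mathbf{x}; y_r) - I(\mathbf{x}; \mathbf{y}_e) && (42)\\
&= \log\left(1 + P|\mathbf{h}_r^\dagger\psi_{\max}|^2\right) - \log\left(1 + P\|\mathbf{H}_e\psi_{\max}\|^2\right) && (43)\\
&= \log\frac{\psi_{\max}^\dagger(\mathbf{I} + P\mathbf{h}_r\mathbf{h}_r^\dagger)\psi_{\max}}{\psi_{\max}^\dagger(\mathbf{I} + P\mathbf{H}_e^\dagger\mathbf{H}_e)\psi_{\max}}\\
&= \log\lambda_{\max}(\mathbf{I} + P\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{I} + P\mathbf{H}_e^\dagger\mathbf{H}_e), && (44)
\end{aligned}
$$

where (42) follows from the fact that $\mathbf{x}$ is a deterministic
function of $u$, (43) follows from the choice of $\mathbf{x}$ and $u$ in
(41), and (44) follows from the variational characterization of
generalized eigenvalues (2).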
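
The achievability computation can be checked numerically. The following is a minimal sketch, assuming NumPy; the function names and the random test channel are illustrative and do not come from the paper. It computes the largest generalized eigenvalue of the pencil $(\mathbf{I} + P\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{I} + P\mathbf{H}_e^\dagger\mathbf{H}_e)$ through a Cholesky reduction, and compares $\log_2\lambda_{\max}$ with the rate obtained by beamforming along the corresponding unit-norm eigenvector.

```python
import numpy as np

def largest_generalized_eig(A, B):
    """Largest generalized eigenvalue and unit-norm eigenvector of the pencil (A, B).

    Assumes A is Hermitian and B is Hermitian positive definite.
    """
    L = np.linalg.cholesky(B)              # B = L L^H
    Linv = np.linalg.inv(L)
    C = Linv @ A @ Linv.conj().T           # Hermitian matrix sharing the pencil's eigenvalues
    w, V = np.linalg.eigh(C)               # eigenvalues in ascending order
    psi = Linv.conj().T @ V[:, -1]         # map back: psi = L^{-H} v
    return w[-1], psi / np.linalg.norm(psi)

def beamforming_rate(h_r, H_e, P):
    """Return (log2 lambda_max, rate of beamforming along psi_max), both in bits."""
    n_t = h_r.shape[0]
    A = np.eye(n_t) + P * np.outer(h_r, h_r.conj())
    B = np.eye(n_t) + P * H_e.conj().T @ H_e
    lam, psi = largest_generalized_eig(A, B)
    rate = (np.log2(1 + P * abs(h_r.conj() @ psi) ** 2)
            - np.log2(1 + P * np.linalg.norm(H_e @ psi) ** 2))
    return float(np.log2(lam)), float(rate)

def random_channel(n_t, n_e, seed=0):
    """Hypothetical test channel with i.i.d. CN(0, 1) entries (illustration only)."""
    rng = np.random.default_rng(seed)
    h_r = (rng.standard_normal(n_t) + 1j * rng.standard_normal(n_t)) / np.sqrt(2)
    H_e = (rng.standard_normal((n_e, n_t)) + 1j * rng.standard_normal((n_e, n_t))) / np.sqrt(2)
    return h_r, H_e
```

For a given realization, the two values returned by `beamforming_rate` should agree up to floating-point error, which is the content of the step from (43) to (44).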

We next show a converse: that rates greater than (11) are
not achievable using our upper bound. Specifically, we show
that (11) corresponds to our upper bound expression (7) in
Theorem 1.

It suffices to show that a particular choice of $\phi$ that is
admissible (i.e., such that $\mathbf{K}_\phi \in \mathcal{K}_\phi$) minimizes (7).
We can do this by showing that

$$
\max_{\mathbf{K}_P \in \mathcal{K}_P} R_+(\mathbf{K}_P, \mathbf{K}_\phi) \tag{45}
$$

with the chosen $\phi$ corresponds to (11).

Since only the first term on the right hand side of

$$
R_+(\mathbf{K}_P, \mathbf{K}_\phi) = I(\mathbf{x}; y_r \mid \mathbf{y}_e) = h(y_r \mid \mathbf{y}_e) - h(z_r \mid \mathbf{z}_e)
$$

depends on $\mathbf{K}_P$, we can restrict our attention to maximizing
this first term with respect to $\mathbf{K}_P$.

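Proceeding, exploiting that all variables are jointly Gaussian,
we express this first term in the form of the optimization

$$
\begin{aligned}
h(y_r \mid \mathbf{y}_e)
&= \min_{\theta}\ h(y_r - \theta^\dagger\mathbf{y}_e) && (46)\\
&= \min_{\theta}\ h\big((\mathbf{h}_r - \mathbf{H}_e^\dagger\theta)^\dagger\mathbf{x} + z_r - \theta^\dagger\mathbf{z}_e\big)\\
&= \min_{\theta}\ \log\big[(\mathbf{h}_r - \mathbf{H}_e^\dagger\theta)^\dagger\mathbf{K}_P(\mathbf{h}_r - \mathbf{H}_e^\dagger\theta) + 1 + \|\theta\|^2 - 2\operatorname{Re}\{\theta^\dagger\phi\}\big],
\end{aligned}
$$

and bound its maximum over $\mathbf{K}_P$ according to

$$
\begin{aligned}
\max_{\mathbf{K}_P \in \mathcal{K}_P} h(y_r \mid \mathbf{y}_e)
&= \max_{\mathbf{K}_P \in \mathcal{K}_P}\ \min_{\theta}\ \log\big[(\mathbf{h}_r - \mathbf{H}_e^\dagger\theta)^\dagger\mathbf{K}_P(\mathbf{h}_r - \mathbf{H}_e^\dagger\theta) + 1 + \|\theta\|^2 - 2\operatorname{Re}\{\theta^\dagger\phi\}\big]\\
&\le \min_{\theta}\ \max_{\mathbf{K}_P \in \mathcal{K}_P}\ \log\big[(\mathbf{h}_r - \mathbf{H}_e^\dagger\theta)^\dagger\mathbf{K}_P(\mathbf{h}_r - \mathbf{H}_e^\dagger\theta) + 1 + \|\theta\|^2 - 2\operatorname{Re}\{\theta^\dagger\phi\}\big]\\
&= \min_{\theta}\ \log\big[P\|\mathbf{h}_r - \mathbf{H}_e^\dagger\theta\|^2 + 1 + \|\theta\|^2 - 2\operatorname{Re}\{\theta^\dagger\phi\}\big], && (47)
\end{aligned}
$$

where (47) follows by observing that a rank one $\mathbf{K}_P$ maximizes
the quadratic form
$(\mathbf{h}_r - \mathbf{H}_e^\dagger\theta)^\dagger\mathbf{K}_P(\mathbf{h}_r - \mathbf{H}_e^\dagger\theta)$.

Note that directly verifying that a rank one covariance
maximizes the term $h(y_r \mid \mathbf{y}_e)$ appears difficult. The above
elegant derivation between (46) and (47) was suggested to us by
Yonina C. Eldar and Ami Wiesel. In the literature, this line
of reasoning has been used in deriving an extremal
characterization of the Schur complement of a matrix (see e.g., [17,
Chapter 20], [18]).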

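We now separately consider the cases $\lambda_{\max} > 1$ and
$\lambda_{\max} \le 1$.

Case: $\lambda_{\max} > 1$: We show that the choice

$$
\phi = \frac{\mathbf{H}_e\psi_{\max}}{\mathbf{h}_r^\dagger\psi_{\max}} \tag{48}
$$

in (45) yields (11), i.e., $\log\lambda_{\max}$.

We begin by noting that since $\lambda_{\max} > 1$, the variational
characterization (2) establishes that $\|\phi\| < 1$ and thus
$\mathbf{K}_\phi \in \mathcal{K}_\phi$ as defined in (10).

Then, provided that, with $\phi$ as given in (48), the right hand
side of (47) evaluates to

$$
\min_{\theta}\ \log\big[P\|\mathbf{h}_r - \mathbf{H}_e^\dagger\theta\|^2 + 1 + \|\theta\|^2 - 2\operatorname{Re}\{\theta^\dagger\phi\}\big]
= \log\big(\lambda_{\max}\cdot(1 - \|\phi\|^2)\big), \tag{49}
$$

we have

$$
\begin{aligned}
R_+ &\le \max_{\mathbf{K}_P \in \mathcal{K}_P} R_+(\mathbf{K}_P, \mathbf{K}_\phi)\\
&= \max_{\mathbf{K}_P \in \mathcal{K}_P} h(y_r \mid \mathbf{y}_e) - h(z_r \mid \mathbf{z}_e)\\
&\le \log\big(\lambda_{\max}\cdot(1 - \|\phi\|^2)\big) - \log(1 - \|\phi\|^2)\\
&= \log(\lambda_{\max}),
\end{aligned}
$$

i.e., (11), as required. Verifying (49) with (48) is a
straightforward computation, the details of which are provided in
Appendix II.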
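
The identity (49) can also be spot-checked numerically. The sketch below assumes NumPy and assumes that the choice (48) has the form $\phi = \mathbf{H}_e\psi_{\max}/(\mathbf{h}_r^\dagger\psi_{\max})$; the function name is hypothetical. The minimization over $\theta$ is carried out in closed form, since the bracketed expression in (47) is a convex quadratic in $\theta$ whose minimum is $1 + P\|\mathbf{h}_r\|^2 - \mathbf{c}^\dagger(\mathbf{I} + P\mathbf{H}_e\mathbf{H}_e^\dagger)^{-1}\mathbf{c}$ with $\mathbf{c} = P\mathbf{H}_e\mathbf{h}_r + \phi$.

```python
import numpy as np

def check_identity_49(h_r, H_e, P):
    """Return (lambda_max, ||phi||, lhs, rhs) for the identity (49), without the logs.

    lhs is the minimum over theta of the bracketed quadratic in (47);
    rhs is lambda_max * (1 - ||phi||^2). phi follows the assumed form of (48).
    """
    n_t, n_e = h_r.shape[0], H_e.shape[0]
    A = np.eye(n_t) + P * np.outer(h_r, h_r.conj())
    B = np.eye(n_t) + P * H_e.conj().T @ H_e
    w, V = np.linalg.eig(np.linalg.solve(B, A))   # eigenvalues of B^{-1} A
    k = int(np.argmax(w.real))
    lam, psi = w[k].real, V[:, k]
    phi = (H_e @ psi) / (h_r.conj() @ psi)        # invariant to the phase of psi
    M = np.eye(n_e) + P * H_e @ H_e.conj().T
    c = P * H_e @ h_r + phi
    lhs = 1 + P * np.linalg.norm(h_r) ** 2 - (c.conj() @ np.linalg.solve(M, c)).real
    rhs = lam * (1 - np.linalg.norm(phi) ** 2)
    return float(lam), float(np.linalg.norm(phi)), float(lhs), float(rhs)
```

On a channel with $\lambda_{\max} > 1$, the returned norm should be below one and `lhs` should match `rhs` up to rounding.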

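Case: $\lambda_{\max} \le 1$, $\mathbf{H}_e$ full column rank: We show that the
choice

$$
\phi = \mathbf{H}_e(\mathbf{H}_e^\dagger\mathbf{H}_e)^{-1}\mathbf{h}_r \tag{50}
$$

in (45) yields (11), i.e., zero.

To verify that $\|\phi\| \le 1$, first note that since
$\lambda_{\max} \le 1$, it follows from (2) that

$$
\lambda_{\max}(\mathbf{I} + P\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{I} + P\mathbf{H}_e^\dagger\mathbf{H}_e) \le 1
\iff
\lambda_{\max}(\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{H}_e^\dagger\mathbf{H}_e) \le 1, \tag{51}
$$

so that for any choice of $\psi$,

$$
|\mathbf{h}_r^\dagger\psi| \le \|\mathbf{H}_e\psi\|. \tag{52}
$$

Choosing $\psi = (\mathbf{H}_e^\dagger\mathbf{H}_e)^{-1}\mathbf{h}_r$ in (52) yields
$\|\phi\|^2 \le \|\phi\|$, i.e., $\|\phi\| \le 1$, as required.

Next, note that (47) is further upper bounded by choosing
any particular choice of $\theta$. Choosing $\theta = \phi$ yields

$$
R_+ \le \log\left(\frac{P\|\mathbf{h}_r - \mathbf{H}_e^\dagger\phi\|^2}{1 - \|\phi\|^2} + 1\right), \tag{53}
$$

which with the choice (50) for $\phi$ is zero.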
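
A short numerical illustration of this case follows, as a sketch assuming NumPy. It assumes that the bound (53) has the form $\log\big(P\|\mathbf{h}_r - \mathbf{H}_e^\dagger\phi\|^2/(1 - \|\phi\|^2) + 1\big)$, which is what setting $\theta = \phi$ in (47) and subtracting $\log(1 - \|\phi\|^2)$ gives; the function names are hypothetical.

```python
import numpy as np

def phi_degraded(h_r, H_e):
    """phi = H_e (H_e^H H_e)^{-1} h_r, the choice (50); H_e must have full column rank."""
    gram = H_e.conj().T @ H_e
    return H_e @ np.linalg.solve(gram, h_r)

def bound_53_bits(h_r, H_e, P):
    """Assumed right-hand side of (53), in bits, evaluated at phi from (50)."""
    phi = phi_degraded(h_r, H_e)
    residual = np.linalg.norm(h_r - H_e.conj().T @ phi) ** 2   # zero up to rounding
    slack = 1.0 - np.linalg.norm(phi) ** 2                     # positive when ||phi|| < 1
    return float(np.log2(P * residual / slack + 1.0))
```

Because $\mathbf{H}_e^\dagger\phi = \mathbf{h}_r$ for this choice, the residual vanishes and the bound evaluates to zero whenever $\|\phi\| < 1$.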

Case: $\lambda_{\max} \le 1$, $\mathbf{H}_e$ not full column rank: Consider a new
MISOME channel with $n_t' < n_t$ transmit antennas, where $n_t'$
is the column rank of $\mathbf{H}_e$, where the intended receiver and
eavesdropper channel gains are given by

$$
\mathbf{g}_r = \mathbf{Q}^\dagger\mathbf{h}_r, \qquad \mathbf{G}_e = \mathbf{H}_e\mathbf{Q}, \tag{54}
$$

and where $\mathbf{Q}$ is a matrix whose columns constitute an
orthonormal basis for the column space of $\mathbf{H}_e^\dagger$, so that in
this new channel $\mathbf{G}_e$ has full column rank.

Then provided the new channel (54) has the same capacity
as the original channel, it follows by the analysis of the
previous case that the capacity of both channels is zero. Thus
it remains only to show the following.

Claim 1: The MISOME channel $(\mathbf{g}_r, \mathbf{G}_e)$ corresponding to
(54) has the same secrecy capacity as that corresponding to
$(\mathbf{h}_r, \mathbf{H}_e)$.

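Proof: First we show that the new channel capacity is
no larger than the original one. In particular, we have

$$
\begin{aligned}
\lambda_{\max}(\mathbf{I} + P\mathbf{g}_r\mathbf{g}_r^\dagger,\ \mathbf{I} + P\mathbf{G}_e^\dagger\mathbf{G}_e)
&= \max_{\{\psi' : \|\psi'\| = 1\}} \frac{1 + P|\mathbf{g}_r^\dagger\psi'|^2}{1 + P\|\mathbf{G}_e\psi'\|^2} && (55)\\
&= \max_{\{\psi' : \|\psi'\| = 1\}} \frac{1 + P|\mathbf{h}_r^\dagger\mathbf{Q}\psi'|^2}{1 + P\|\mathbf{H}_e\mathbf{Q}\psi'\|^2} && (56)\\
&= \max_{\{\psi : \psi = \mathbf{Q}\psi',\ \|\psi\| = 1\}} \frac{1 + P|\mathbf{h}_r^\dagger\psi|^2}{1 + P\|\mathbf{H}_e\psi\|^2} && (57)\\
&\le \max_{\{\psi : \|\psi\| = 1\}} \frac{1 + P|\mathbf{h}_r^\dagger\psi|^2}{1 + P\|\mathbf{H}_e\psi\|^2} && (58)\\
&= \lambda_{\max}(\mathbf{I} + P\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{I} + P\mathbf{H}_e^\dagger\mathbf{H}_e), && (59)
\end{aligned}
$$

where to obtain (55) we have used (2) for the new channel, to
obtain (56) we have used (54), to obtain (57) we have used that
$\mathbf{Q}^\dagger\mathbf{Q} = \mathbf{I}$, to obtain (58) we have used that we are
maximizing over a larger set, and to obtain (59) we have used (2)
for the original channel. Thus,

$$
\{\log\lambda_{\max}(\mathbf{I} + P\mathbf{g}_r\mathbf{g}_r^\dagger,\ \mathbf{I} + P\mathbf{G}_e^\dagger\mathbf{G}_e)\}^+
\le \{\log\lambda_{\max}(\mathbf{I} + P\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{I} + P\mathbf{H}_e^\dagger\mathbf{H}_e)\}^+. \tag{60}
$$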

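Next, we show the new channel capacity is no smaller than
the original one. To begin, note that

$$
\operatorname{Null}(\mathbf{H}_e) \subseteq \operatorname{Null}(\mathbf{h}_r^\dagger), \tag{61}
$$

since if $\operatorname{Null}(\mathbf{H}_e) \not\subseteq \operatorname{Null}(\mathbf{h}_r^\dagger)$, then
$\lambda_{\max}(\mathbf{h}_r\mathbf{h}_r^\dagger, \mathbf{H}_e^\dagger\mathbf{H}_e) = \infty$,
which would violate (51).

Proceeding, every $\mathbf{x} \in \mathbb{C}^{n_t}$ can be written as

$$
\mathbf{x} = \mathbf{Q}\mathbf{x}' + \tilde{\mathbf{x}}, \tag{62}
$$

where $\mathbf{H}_e\tilde{\mathbf{x}} = \mathbf{0}$ and thus, via (61),
$\mathbf{h}_r^\dagger\tilde{\mathbf{x}} = 0$ as well. Hence,
we have that $\mathbf{h}_r^\dagger\mathbf{x} = \mathbf{g}_r^\dagger\mathbf{x}'$,
$\mathbf{H}_e\mathbf{x} = \mathbf{G}_e\mathbf{x}'$, and $\|\mathbf{x}'\|^2 \le \|\mathbf{x}\|^2$,
so any rate achieved by $p_{\mathbf{x}}$ on the channel $(\mathbf{h}_r, \mathbf{H}_e)$ is also
achieved by $p_{\mathbf{x}'}$ on the channel $(\mathbf{g}_r, \mathbf{G}_e)$, with $p_{\mathbf{x}'}$
derived from $p_{\mathbf{x}}$ via (62), whence

$$
\{\log\lambda_{\max}(\mathbf{I} + P\mathbf{g}_r\mathbf{g}_r^\dagger,\ \mathbf{I} + P\mathbf{G}_e^\dagger\mathbf{G}_e)\}^+
\ge \{\log\lambda_{\max}(\mathbf{I} + P\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{I} + P\mathbf{H}_e^\dagger\mathbf{H}_e)\}^+. \tag{63}
$$

Combining (63) and (60) establishes our claim. ■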
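
The reduction in Claim 1 can be illustrated numerically. The following is a sketch assuming NumPy, with hypothetical function names; $\mathbf{Q}$ is taken from the singular value decomposition of $\mathbf{H}_e$, which is one convenient way (not necessarily the authors') to obtain an orthonormal basis for the column space of $\mathbf{H}_e^\dagger$. The comparison is meaningful when (61) holds, i.e., when $\mathbf{h}_r$ lies in that column space.

```python
import numpy as np

def secrecy_capacity_bits(h_r, H_e, P):
    """{log2 lambda_max(I + P h h^H, I + P H^H H)}^+ in bits."""
    n_t = h_r.shape[0]
    A = np.eye(n_t) + P * np.outer(h_r, h_r.conj())
    B = np.eye(n_t) + P * H_e.conj().T @ H_e
    lam = np.linalg.eigvals(np.linalg.solve(B, A)).real.max()
    return max(0.0, float(np.log2(lam)))

def reduce_channel(h_r, H_e, tol=1e-10):
    """Return (g_r, G_e) = (Q^H h_r, H_e Q) as in (54)."""
    _, s, Vh = np.linalg.svd(H_e)
    rank = int(np.sum(s > tol * s[0]))
    Q = Vh[:rank].conj().T        # orthonormal basis for the column space of H_e^H
    return Q.conj().T @ h_r, H_e @ Q
```

For a rank-deficient $\mathbf{H}_e$ and an $\mathbf{h}_r$ in the column space of $\mathbf{H}_e^\dagger$, `secrecy_capacity_bits` should return the same value for the original and the reduced channel.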

B. Proof of Corollary 1 

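We restrict our attention to the case $\lambda_{\max} > 1$ where the
capacity is nonzero. In this case, since, via (2),

$$
\lambda_{\max}(\mathbf{I} + P\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{I} + P\mathbf{H}_e^\dagger\mathbf{H}_e)
= \frac{1 + P|\mathbf{h}_r^\dagger\psi_{\max}(P)|^2}{1 + P\|\mathbf{H}_e\psi_{\max}(P)\|^2} > 1, \tag{64}
$$

where

$$
\psi_{\max}(P) = \arg\max_{\{\psi : \|\psi\| = 1\}} \frac{1 + P|\mathbf{h}_r^\dagger\psi|^2}{1 + P\|\mathbf{H}_e\psi\|^2}, \tag{65}
$$

we have

$$
|\mathbf{h}_r^\dagger\psi_{\max}(P)| > \|\mathbf{H}_e\psi_{\max}(P)\| \tag{66}
$$

for all $P > 0$.

To obtain an upper bound note that, for all $P > 0$,

$$
\begin{aligned}
\lambda_{\max}(\mathbf{I} + P\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{I} + P\mathbf{H}_e^\dagger\mathbf{H}_e)
&\le \frac{|\mathbf{h}_r^\dagger\psi_{\max}(P)|^2}{\|\mathbf{H}_e\psi_{\max}(P)\|^2} && (67)\\
&\le \lambda_{\max}(\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{H}_e^\dagger\mathbf{H}_e), && (68)
\end{aligned}
$$

where (67) follows from the Rayleigh quotient expansion (64)
and the fact that, due to (66), the right hand side of (64) is
increasing in $P$, and where (68) follows from (2). Thus, since
the right hand side of (68) is independent of $P$ we have

$$
\lim_{P \to \infty} \lambda_{\max}(\mathbf{I} + P\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{I} + P\mathbf{H}_e^\dagger\mathbf{H}_e)
\le \lambda_{\max}(\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{H}_e^\dagger\mathbf{H}_e). \tag{69}
$$

Next, defining

$$
\psi_{\max}(\infty) = \arg\max_{\{\psi : \|\psi\| = 1\}} \frac{|\mathbf{h}_r^\dagger\psi|^2}{\|\mathbf{H}_e\psi\|^2}, \tag{70}
$$

we have the lower bound

$$
\begin{aligned}
\lim_{P \to \infty} \lambda_{\max}(\mathbf{I} + P\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{I} + P\mathbf{H}_e^\dagger\mathbf{H}_e)
&\ge \lim_{P \to \infty} \frac{1/P + |\mathbf{h}_r^\dagger\psi_{\max}(\infty)|^2}{1/P + \|\mathbf{H}_e\psi_{\max}(\infty)\|^2} && (71)\\
&= \lambda_{\max}(\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{H}_e^\dagger\mathbf{H}_e), && (72)
\end{aligned}
$$

where (71) follows from (2) and (72) follows from (70).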

Since (69) and (72) coincide we obtain (12). Thus, to obtain 
the remainder of (13a) we need only verify the following. 

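Claim 2: The high SNR capacity is finite, i.e.,
$\lambda_{\max}(\mathbf{h}_r\mathbf{h}_r^\dagger, \mathbf{H}_e^\dagger\mathbf{H}_e) < \infty$,
when $\mathbf{H}_e^\perp\mathbf{h}_r = \mathbf{0}$.

Proof: We argue by contradiction. Suppose
$\lambda_{\max}(\mathbf{h}_r\mathbf{h}_r^\dagger, \mathbf{H}_e^\dagger\mathbf{H}_e) = \infty$.
Then there must exist a sequence $\psi_k$ such that
$\|\mathbf{H}_e\psi_k\| > 0$ for each $k = 1, 2, \ldots$, but
$\|\mathbf{H}_e\psi_k\| \to 0$ as $k \to \infty$. But then the hypothesis
cannot be true, because, as we now show,
$|\mathbf{h}_r^\dagger\psi|^2/\|\mathbf{H}_e\psi\|^2$, when viewed as a function of
$\psi$, is bounded whenever the denominator is nonzero.

Let $\psi$ be any vector such that $\|\mathbf{H}_e\psi\| = \delta > 0$. It
suffices to show that

$$
\frac{|\mathbf{h}_r^\dagger\psi|^2}{\|\mathbf{H}_e\psi\|^2} \le \frac{\|\mathbf{h}_r\|^2}{\sigma^2}, \tag{73}
$$

where $\sigma$ is the smallest nonzero singular value of $\mathbf{H}_e$.
To verify (73), we first express $\psi$ in the form

$$
\psi = c\,\psi' + d\,\tilde{\psi}, \tag{74}
$$

where $\psi'$ and $\tilde{\psi}$ are unit vectors, $c$ and $d$ are real and
nonnegative, $d\tilde{\psi}$ is the projection of $\psi$ onto the null space
of $\mathbf{H}_e$, and $c\psi'$ is the projection of $\psi$ onto the orthogonal
complement of this null space.

Next, we note that
$\delta = \|\mathbf{H}_e\psi\| = c\|\mathbf{H}_e\psi'\| \ge c\sigma$, whence

$$
c \le \frac{\delta}{\sigma}. \tag{75}
$$

But since $\mathbf{H}_e^\perp\mathbf{h}_r = \mathbf{0}$ it follows that
$\mathbf{h}_r^\dagger\tilde{\psi} = 0$, so

$$
|\mathbf{h}_r^\dagger\psi|^2 = c^2|\mathbf{h}_r^\dagger\psi'|^2 \le c^2\|\mathbf{h}_r\|^2 \le \frac{\delta^2}{\sigma^2}\|\mathbf{h}_r\|^2, \tag{76}
$$

where the first inequality follows from the Cauchy-Schwarz
inequality, and the second inequality is a simple substitution
from (75). Dividing through by $\|\mathbf{H}_e\psi\|^2 = \delta^2$ in (76) yields
(73).

■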

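We now develop (13b) for the case where $\mathbf{H}_e^\perp\mathbf{h}_r \ne \mathbf{0}$.
First, defining

$$
\mathcal{S}_\infty = \{\psi : \|\psi\| = 1,\ \|\mathbf{H}_e\psi\| = 0\},
$$

we obtain the lower bound

$$
\begin{aligned}
\frac{1}{P}\lambda_{\max}(\mathbf{I} + P\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{I} + P\mathbf{H}_e^\dagger\mathbf{H}_e)
&\ge \max_{\psi \in \mathcal{S}_\infty} \frac{1/P + |\mathbf{h}_r^\dagger\psi|^2}{1 + P\|\mathbf{H}_e\psi\|^2} && (77)\\
&= \max_{\psi \in \mathcal{S}_\infty} \frac{1}{P} + |\mathbf{h}_r^\dagger\psi|^2 && (78)\\
&= \frac{1}{P} + \|\mathbf{H}_e^\perp\mathbf{h}_r\|^2, && (79)
\end{aligned}
$$

where to obtain (79) we have used

$$
\max_{\{\psi : \|\psi\| = 1,\ \mathbf{H}_e\psi = \mathbf{0}\}} |\mathbf{h}_r^\dagger\psi|^2 = \|\mathbf{H}_e^\perp\mathbf{h}_r\|^2. \tag{80}
$$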

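Next we develop an upper bound. We first establish the
following.

Claim 3: If $\mathbf{H}_e^\perp\mathbf{h}_r \ne \mathbf{0}$ then there is a function
$\varepsilon(P)$ such that $\varepsilon(P) \to 0$ as $P \to \infty$, and

$$
\|\mathbf{H}_e\psi_{\max}(P)\| \le \varepsilon(P).
$$

Proof: We have

$$
\begin{aligned}
\frac{1 + P\|\mathbf{h}_r\|^2}{1 + P\|\mathbf{H}_e\psi_{\max}(P)\|^2}
&\ge \frac{1 + P|\mathbf{h}_r^\dagger\psi_{\max}(P)|^2}{1 + P\|\mathbf{H}_e\psi_{\max}(P)\|^2} && (81)\\
&\ge \max_{\{\psi : \mathbf{H}_e\psi = \mathbf{0},\ \|\psi\| = 1\}} \frac{1 + P|\mathbf{h}_r^\dagger\psi|^2}{1 + P\|\mathbf{H}_e\psi\|^2} && (82)\\
&= \max_{\{\psi : \mathbf{H}_e\psi = \mathbf{0},\ \|\psi\| = 1\}} \left(1 + P|\mathbf{h}_r^\dagger\psi|^2\right)\\
&= 1 + P\|\mathbf{H}_e^\perp\mathbf{h}_r\|^2, && (83)
\end{aligned}
$$

where to obtain (81) we have used the Cauchy-Schwarz
inequality $|\mathbf{h}_r^\dagger\psi_{\max}(P)|^2 \le \|\mathbf{h}_r\|^2$, to obtain (82) we
have used (65), and to obtain (83) we have used (80).
Rearranging (83) then gives

$$
\|\mathbf{H}_e\psi_{\max}(P)\|^2 \le \frac{1}{P}\left[\frac{1 + P\|\mathbf{h}_r\|^2}{1 + P\|\mathbf{H}_e^\perp\mathbf{h}_r\|^2} - 1\right] \triangleq \varepsilon^2(P),
$$

as desired. ■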

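Thus with $\mathcal{S}_P = \{\psi : \|\psi\| = 1,\ \|\mathbf{H}_e\psi\| \le \varepsilon(P)\}$ we have

$$
\begin{aligned}
\frac{1}{P}\lambda_{\max}(\mathbf{I} + P\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{I} + P\mathbf{H}_e^\dagger\mathbf{H}_e)
&= \max_{\psi \in \mathcal{S}_P} \frac{1/P + |\mathbf{h}_r^\dagger\psi|^2}{1 + P\|\mathbf{H}_e\psi\|^2} && (84)\\
&\le \max_{\psi \in \mathcal{S}_P} \frac{1}{P} + |\mathbf{h}_r^\dagger\psi|^2, && (85)
\end{aligned}
$$

where (84) follows from (2) and from Claim 3, which establishes
that the maximizing $\psi_{\max}$ lies in $\mathcal{S}_P$.

Now, as we will show,

$$
\max_{\psi \in \mathcal{S}_P} |\mathbf{h}_r^\dagger\psi|^2 \le \|\mathbf{H}_e^\perp\mathbf{h}_r\|^2 + \frac{\varepsilon^2(P)}{\sigma^2}\|\mathbf{h}_r\|^2, \tag{86}
$$

so using (86) in (85) we obtain

$$
\frac{1}{P}\lambda_{\max}(\mathbf{I} + P\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{I} + P\mathbf{H}_e^\dagger\mathbf{H}_e)
\le \|\mathbf{H}_e^\perp\mathbf{h}_r\|^2 + \frac{\varepsilon^2(P)}{\sigma^2}\|\mathbf{h}_r\|^2 + \frac{1}{P}. \tag{87}
$$

Finally, combining (87) and (79) we obtain

$$
\lim_{P \to \infty} \frac{1}{P}\lambda_{\max}(\mathbf{I} + P\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{I} + P\mathbf{H}_e^\dagger\mathbf{H}_e) = \|\mathbf{H}_e^\perp\mathbf{h}_r\|^2,
$$

whence (13b).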
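
The limit above is easy to observe numerically when the eavesdropper has fewer antennas than the sender, so that $\mathbf{H}_e$ has a nontrivial null space. The sketch below assumes NumPy; the function names are hypothetical, and the null-space basis is taken from the singular value decomposition as one convenient choice.

```python
import numpy as np

def null_space_gain(h_r, H_e, tol=1e-10):
    """||H_e^perp h_r||^2: energy of h_r inside the null space of H_e."""
    _, s, Vh = np.linalg.svd(H_e)
    rank = int(np.sum(s > tol * s[0]))
    N = Vh[rank:].conj().T                 # orthonormal basis of Null(H_e)
    return float(np.linalg.norm(N.conj().T @ h_r) ** 2)

def scaled_lambda_max(h_r, H_e, P):
    """(1/P) * lambda_max(I + P h h^H, I + P H^H H), via a Cholesky reduction."""
    n_t = h_r.shape[0]
    A = np.eye(n_t) + P * np.outer(h_r, h_r.conj())
    B = np.eye(n_t) + P * H_e.conj().T @ H_e
    Linv = np.linalg.inv(np.linalg.cholesky(B))
    lam = np.linalg.eigvalsh(Linv @ A @ Linv.conj().T)[-1]
    return float(lam / P)
```

At large $P$, `scaled_lambda_max` should sit slightly above `null_space_gain`, in line with the lower bound (79), and approach it as $P$ grows.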

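Thus, it remains only to verify (86), which we do now.
We start by expressing $\psi \in \mathcal{S}_P$ in the form [cf. (74)]

$$
\psi = c\,\psi' + d\,\tilde{\psi}, \tag{88}
$$

where $\psi'$ and $\tilde{\psi}$ are unit vectors, $c, d$ are real valued
scalars in $[0, 1]$, $d\tilde{\psi}$ is the projection of $\psi$ onto the null
space of $\mathbf{H}_e$, and $c\psi'$ is the projection of $\psi$ onto the
orthogonal complement of this null space.

With these definitions we have

$$
\varepsilon(P) \ge \|\mathbf{H}_e\psi\| = c\|\mathbf{H}_e\psi'\| \ge c\sigma, \tag{89}
$$

since $\mathbf{H}_e\tilde{\psi} = \mathbf{0}$ and $\|\mathbf{H}_e\psi'\| \ge \sigma$.
Finally,

$$
\begin{aligned}
|\mathbf{h}_r^\dagger\psi|^2 &= |d\,\mathbf{h}_r^\dagger\tilde{\psi} + c\,\mathbf{h}_r^\dagger\psi'|^2 && (90)\\
&= d^2|\mathbf{h}_r^\dagger\tilde{\psi}|^2 + c^2|\mathbf{h}_r^\dagger\psi'|^2 && (91)\\
&\le |\mathbf{h}_r^\dagger\tilde{\psi}|^2 + \frac{\varepsilon^2(P)}{\sigma^2}|\mathbf{h}_r^\dagger\psi'|^2 && (92)\\
&\le |\mathbf{h}_r^\dagger\tilde{\psi}|^2 + \frac{\varepsilon^2(P)}{\sigma^2}\|\mathbf{h}_r\|^2 && (93)\\
&\le \|\mathbf{H}_e^\perp\mathbf{h}_r\|^2 + \frac{\varepsilon^2(P)}{\sigma^2}\|\mathbf{h}_r\|^2, && (94)
\end{aligned}
$$

where (90) follows from substituting (88), (91) follows from
the fact that $\psi'$ and $\tilde{\psi}$ are orthogonal, (92) follows from using
(89) to bound $c^2$, and (94) follows from the fact that
$\mathbf{H}_e\tilde{\psi} = \mathbf{0}$ and (80).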



C. Proof of Corollary 2 

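We consider the limit $P \to 0$. In the following steps, the
order notation $O(P)$ means that $O(P)/P \to 0$ as $P \to 0$.

$$
\begin{aligned}
&\lambda_{\max}(\mathbf{I} + P\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{I} + P\mathbf{H}_e^\dagger\mathbf{H}_e) && (95)\\
&\quad= \lambda_{\max}\big((\mathbf{I} + P\mathbf{H}_e^\dagger\mathbf{H}_e)^{-1}(\mathbf{I} + P\mathbf{h}_r\mathbf{h}_r^\dagger)\big) && (96)\\
&\quad= \lambda_{\max}\big((\mathbf{I} - P\mathbf{H}_e^\dagger\mathbf{H}_e + O(P))(\mathbf{I} + P\mathbf{h}_r\mathbf{h}_r^\dagger)\big) && (97)\\
&\quad= \lambda_{\max}\big((\mathbf{I} - P\mathbf{H}_e^\dagger\mathbf{H}_e)(\mathbf{I} + P\mathbf{h}_r\mathbf{h}_r^\dagger)\big) + O(P) && (98)\\
&\quad= \lambda_{\max}\big(\mathbf{I} + P(\mathbf{h}_r\mathbf{h}_r^\dagger - \mathbf{H}_e^\dagger\mathbf{H}_e)\big) + O(P) && (99)\\
&\quad= 1 + P\lambda_{\max}(\mathbf{h}_r\mathbf{h}_r^\dagger - \mathbf{H}_e^\dagger\mathbf{H}_e) + O(P), && (100)
\end{aligned}
$$

where (96) follows from the definition of generalized
eigenvalue, (97) follows from the Taylor series expansion of
$(\mathbf{I} + P\mathbf{H}_e^\dagger\mathbf{H}_e)^{-1}$, where we have assumed that $P$ is
sufficiently small so that all eigenvalues of $P\mathbf{H}_e^\dagger\mathbf{H}_e$ are
less than unity, (98) and (99) follow from the continuity of the
eigenvalue function in its arguments, and (100) follows from the
property of the eigenvalue function that
$\lambda(\mathbf{I} + \mathbf{A}) = 1 + \lambda(\mathbf{A})$.
In turn, we have

$$
\begin{aligned}
\frac{C(P)}{P} &= \frac{\log\big(1 + P\lambda_{\max}(\mathbf{h}_r\mathbf{h}_r^\dagger - \mathbf{H}_e^\dagger\mathbf{H}_e) + O(P)\big)}{P} && (101)\\
&= \frac{\lambda_{\max}(\mathbf{h}_r\mathbf{h}_r^\dagger - \mathbf{H}_e^\dagger\mathbf{H}_e)}{\ln 2} + \frac{O(P)}{P}, && (102)
\end{aligned}
$$

where to obtain (101) we have used (100) in (11), and to
obtain (102) we have used the Taylor series expansion of the
$\ln(\cdot)$ function.

Finally, taking the limit $P \to 0$ in (102) yields (14) as
desired.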
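
The low-SNR slope in (102) can be compared against the exact capacity expression at a small power level. The sketch below assumes NumPy and uses hypothetical function names; rates are in bits, so the slope carries the $1/\ln 2$ factor.

```python
import numpy as np

def low_snr_slope(h_r, H_e):
    """{lambda_max(h h^H - H^H H)}^+ / ln 2, the limiting value of C(P)/P in bits."""
    D = np.outer(h_r, h_r.conj()) - H_e.conj().T @ H_e
    return max(0.0, float(np.linalg.eigvalsh(D)[-1])) / np.log(2)

def capacity_bits(h_r, H_e, P):
    """{log2 lambda_max(I + P h h^H, I + P H^H H)}^+ via a Cholesky reduction."""
    n_t = h_r.shape[0]
    A = np.eye(n_t) + P * np.outer(h_r, h_r.conj())
    B = np.eye(n_t) + P * H_e.conj().T @ H_e
    Linv = np.linalg.inv(np.linalg.cholesky(B))
    lam = np.linalg.eigvalsh(Linv @ A @ Linv.conj().T)[-1]
    return max(0.0, float(np.log2(lam)))
```

For a small `P`, the ratio `capacity_bits(h_r, H_e, P) / P` should be close to `low_snr_slope(h_r, H_e)`.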



VII. Masked Beamforming Scheme Analysis 

From Csiszar-Korner [2], the secrecy rate
$R = I(u; y_r) - I(u; \mathbf{y}_e)$ is achievable for any choice of $p_u$
and $p_{\mathbf{x}|u}$ that satisfy the power constraint
$E[\|\mathbf{x}\|^2] \le P$. While a capacity-achieving scheme corresponds
to maximizing this rate over the choice of $p_u$ and $p_{\mathbf{x}|u}$
(cf. (6)), the masked beamforming scheme corresponds to a different
(suboptimal) choice of these distributions. In particular, we choose

$$
p_u = \mathcal{CN}(0, \tilde{P}) \quad\text{and}\quad
p_{\mathbf{x}|u} = \mathcal{CN}\big(u\,\tilde{\mathbf{h}}_r,\ \tilde{P}(\mathbf{I} - \tilde{\mathbf{h}}_r\tilde{\mathbf{h}}_r^\dagger)\big), \tag{103}
$$

where we have chosen the convenient normalizations

$$
\tilde{P} = \frac{P}{n_t} \tag{104}
$$

and

$$
\tilde{\mathbf{h}}_r = \frac{\mathbf{h}_r}{\|\mathbf{h}_r\|}. \tag{105}
$$

In this form, the secrecy rate of masked beamforming is
readily obtained, as we now show.

A. Proof of Proposition 1 

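With $p_u$ and $p_{\mathbf{x}|u}$ as in (103), we evaluate (6). To this end,
first we have

$$
I(u; y_r) = \log(1 + \tilde{P}\|\mathbf{h}_r\|^2). \tag{106}
$$

Then, to evaluate $I(u; \mathbf{y}_e)$, note that

$$
\begin{aligned}
h(\mathbf{y}_e) &= \log\det(\mathbf{I} + \tilde{P}\mathbf{H}_e\mathbf{H}_e^\dagger),\\
h(\mathbf{y}_e \mid u) &= \log\det\big(\mathbf{I} + \tilde{P}\mathbf{H}_e(\mathbf{I} - \tilde{\mathbf{h}}_r\tilde{\mathbf{h}}_r^\dagger)\mathbf{H}_e^\dagger\big),
\end{aligned}
$$

so

$$
\begin{aligned}
I(u; \mathbf{y}_e)
&= h(\mathbf{y}_e) - h(\mathbf{y}_e \mid u)\\
&= \log\det(\mathbf{I} + \tilde{P}\mathbf{H}_e\mathbf{H}_e^\dagger) - \log\det\big(\mathbf{I} + \tilde{P}\mathbf{H}_e(\mathbf{I} - \tilde{\mathbf{h}}_r\tilde{\mathbf{h}}_r^\dagger)\mathbf{H}_e^\dagger\big)\\
&= \log\det(\mathbf{I} + \tilde{P}\mathbf{H}_e^\dagger\mathbf{H}_e) - \log\det\big(\mathbf{I} + \tilde{P}(\mathbf{I} - \tilde{\mathbf{h}}_r\tilde{\mathbf{h}}_r^\dagger)\mathbf{H}_e^\dagger\mathbf{H}_e\big)\\
&= \log\det(\mathbf{I} + \tilde{P}\mathbf{H}_e^\dagger\mathbf{H}_e) - \log\det\big(\mathbf{I} + \tilde{P}\mathbf{H}_e^\dagger\mathbf{H}_e - \tilde{P}\tilde{\mathbf{h}}_r\tilde{\mathbf{h}}_r^\dagger\mathbf{H}_e^\dagger\mathbf{H}_e\big)\\
&= -\log\det\big(\mathbf{I} - \tilde{P}\tilde{\mathbf{h}}_r\tilde{\mathbf{h}}_r^\dagger\mathbf{H}_e^\dagger\mathbf{H}_e(\mathbf{I} + \tilde{P}\mathbf{H}_e^\dagger\mathbf{H}_e)^{-1}\big)\\
&= -\log\big(1 - \tilde{P}\tilde{\mathbf{h}}_r^\dagger\mathbf{H}_e^\dagger\mathbf{H}_e(\mathbf{I} + \tilde{P}\mathbf{H}_e^\dagger\mathbf{H}_e)^{-1}\tilde{\mathbf{h}}_r\big)\\
&= -\log\big(\tilde{\mathbf{h}}_r^\dagger(\mathbf{I} + \tilde{P}\mathbf{H}_e^\dagger\mathbf{H}_e)^{-1}\tilde{\mathbf{h}}_r\big), && (107)
\end{aligned}
$$

where we have repeatedly used the matrix identity
$\det(\mathbf{I} + \mathbf{A}\mathbf{B}) = \det(\mathbf{I} + \mathbf{B}\mathbf{A})$, valid for any
$\mathbf{A}$ and $\mathbf{B}$ with compatible dimensions.

Thus, combining (106) and (107) we obtain (15) as desired:

$$
\begin{aligned}
R_{\mathrm{MB}}(P)
&= I(u; y_r) - I(u; \mathbf{y}_e)\\
&= \log(1 + \tilde{P}\|\mathbf{h}_r\|^2) + \log\big(\tilde{\mathbf{h}}_r^\dagger(\mathbf{I} + \tilde{P}\mathbf{H}_e^\dagger\mathbf{H}_e)^{-1}\tilde{\mathbf{h}}_r\big)\\
&= \log\left(1 + \frac{1}{\tilde{P}\|\mathbf{h}_r\|^2}\right) + \log\big(\tilde{P}\mathbf{h}_r^\dagger(\mathbf{I} + \tilde{P}\mathbf{H}_e^\dagger\mathbf{H}_e)^{-1}\mathbf{h}_r\big)\\
&= \log\left(1 + \frac{1}{\tilde{P}\|\mathbf{h}_r\|^2}\right) + \log\big(\lambda_{\max}(\tilde{P}\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{I} + \tilde{P}\mathbf{H}_e^\dagger\mathbf{H}_e)\big),
\end{aligned}
$$

where to obtain the last equality we have used the special form
(3) for the largest generalized eigenvalue.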
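
As a numerical cross-check of this derivation, the sketch below (assuming NumPy, with hypothetical function names) evaluates the masked beamforming rate in two ways: through the closed form (15) and directly as $I(u; y_r) - I(u; \mathbf{y}_e)$ from log-determinants. It also computes the gap between $\log_2\lambda_{\max}(\mathbf{I} + \tilde{P}\mathbf{h}_r\mathbf{h}_r^\dagger, \mathbf{I} + \tilde{P}\mathbf{H}_e^\dagger\mathbf{H}_e)$ and $R_{\mathrm{MB}}(P)$, together with the quantity $\log_2\big(1 + 1/(\tilde{P}|\mathbf{h}_r^\dagger\psi_{\max}|^2)\big)$, which upper bounds that gap.

```python
import numpy as np

def masked_beamforming_rate(h_r, H_e, P):
    """R_MB(P) in bits via the closed form (15)."""
    n_t = h_r.shape[0]
    Pt = P / n_t                                      # per-antenna power, as in (104)
    B = np.eye(n_t) + Pt * H_e.conj().T @ H_e
    q = (h_r.conj() @ np.linalg.solve(B, h_r)).real   # h^H (I + Pt H^H H)^{-1} h
    return float(np.log2(1 + 1 / (Pt * np.linalg.norm(h_r) ** 2)) + np.log2(Pt * q))

def masked_beamforming_rate_direct(h_r, H_e, P):
    """R_MB(P) in bits as I(u; y_r) - I(u; y_e), using log-determinants."""
    n_t, n_e = h_r.shape[0], H_e.shape[0]
    Pt = P / n_t
    h_unit = h_r / np.linalg.norm(h_r)
    I_r = np.log2(1 + Pt * np.linalg.norm(h_r) ** 2)
    K_y = np.eye(n_e) + Pt * H_e @ H_e.conj().T
    mask = np.eye(n_t) - np.outer(h_unit, h_unit.conj())
    K_y_given_u = np.eye(n_e) + Pt * H_e @ mask @ H_e.conj().T
    I_e = (np.linalg.slogdet(K_y)[1] - np.linalg.slogdet(K_y_given_u)[1]) / np.log(2)
    return float(I_r - I_e)

def gap_and_bound(h_r, H_e, P):
    """Return (log2 lambda_max at power P/n_t minus R_MB(P), upper bound on that gap)."""
    n_t = h_r.shape[0]
    Pt = P / n_t
    A = np.eye(n_t) + Pt * np.outer(h_r, h_r.conj())
    B = np.eye(n_t) + Pt * H_e.conj().T @ H_e
    w, V = np.linalg.eig(np.linalg.solve(B, A))
    k = int(np.argmax(w.real))
    psi = V[:, k] / np.linalg.norm(V[:, k])
    gap = float(np.log2(w[k].real)) - masked_beamforming_rate(h_r, H_e, P)
    bound = float(np.log2(1 + 1 / (Pt * abs(h_r.conj() @ psi) ** 2)))
    return gap, bound
```

The two rate functions should agree up to rounding, and the gap should be nonnegative and no larger than the bound.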
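B. Proof of Theorem 3

First, from Theorem 2 and Proposition 1 we have, with
again $\tilde{P}$ as in (104) for convenience,

$$
C\!\left(\frac{P}{n_t}\right) - R_{\mathrm{MB}}(P) \le
\log\frac{\lambda_{\max}(\mathbf{I} + \tilde{P}\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{I} + \tilde{P}\mathbf{H}_e^\dagger\mathbf{H}_e)}{\lambda_{\max}(\tilde{P}\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{I} + \tilde{P}\mathbf{H}_e^\dagger\mathbf{H}_e)}. \tag{108}
$$

Next, with $\psi_{\max}$ denoting the generalized eigenvector
corresponding to
$\lambda_{\max}(\mathbf{I} + \tilde{P}\mathbf{h}_r\mathbf{h}_r^\dagger, \mathbf{I} + \tilde{P}\mathbf{H}_e^\dagger\mathbf{H}_e)$, we have

$$
\lambda_{\max}(\mathbf{I} + \tilde{P}\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{I} + \tilde{P}\mathbf{H}_e^\dagger\mathbf{H}_e)
= \frac{1 + \tilde{P}|\mathbf{h}_r^\dagger\psi_{\max}|^2}{1 + \tilde{P}\|\mathbf{H}_e\psi_{\max}\|^2}, \tag{109}
$$

$$
\lambda_{\max}(\tilde{P}\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{I} + \tilde{P}\mathbf{H}_e^\dagger\mathbf{H}_e)
\ge \frac{\tilde{P}|\mathbf{h}_r^\dagger\psi_{\max}|^2}{1 + \tilde{P}\|\mathbf{H}_e\psi_{\max}\|^2}. \tag{110}
$$

Finally, substituting (109) and (110) into (108), we obtain

$$
0 \le C\!\left(\frac{P}{n_t}\right) - R_{\mathrm{MB}}(P) \le \log\left(1 + \frac{1}{\tilde{P}|\mathbf{h}_r^\dagger\psi_{\max}|^2}\right), \tag{112}
$$

the right hand side of which approaches zero as $P \to \infty$,
whence (16) as desired.

VIII. Scaling Laws Development

We begin by summarizing a few well-known results from
random matrix theory that will be useful in our scaling laws;
for further details, see, e.g., [19].

A. Some Random Matrix Properties

Three basic facts will suffice for our purposes.

Fact 4: Suppose that $\mathbf{v}$ is a random length-$n$ complex vector
with independent, zero-mean, variance-$1/n$ elements, and that
$\mathbf{B}$ is a random $n \times n$ complex positive semidefinite matrix
distributed independently of $\mathbf{v}$. Then if the spectrum of $\mathbf{B}$
converges we have

$$
\lim_{n \to \infty} \mathbf{v}^\dagger(\mathbf{I} + \gamma\mathbf{B})^{-1}\mathbf{v} \overset{\mathrm{a.s.}}{=} \eta_{\mathbf{B}}(\gamma), \tag{113}
$$

where $\eta_{\mathbf{B}}(\gamma)$ is the $\eta$-transform [19] of the matrix $\mathbf{B}$.

Of particular interest to us is the $\eta$-transform of a special
class of matrices below.

Fact 5: Suppose that $\mathbf{H} \in \mathbb{C}^{K \times N}$ is a random matrix whose
entries are i.i.d. with variance $1/N$. As $K, N \to \infty$ with the
ratio $K/N \triangleq \beta$ fixed, the $\eta$-transform of
$\mathbf{B} = \mathbf{H}^\dagger\mathbf{H}$ is given by

$$
\eta_{\mathbf{H}^\dagger\mathbf{H}}(\gamma) = \frac{\xi(\gamma, \beta)}{\gamma}, \tag{114}
$$

where $\xi(\cdot, \cdot)$ is as defined in (21).

The distribution of generalized eigenvalues of the pair
$(\mathbf{h}_r\mathbf{h}_r^\dagger, \mathbf{H}_e^\dagger\mathbf{H}_e)$ is also known [20], [21]. For our
purposes, the following is sufficient.

Fact 6: Suppose that $\mathbf{h}_r$ and $\mathbf{H}_e$ have i.i.d. $\mathcal{CN}(0, 1)$
entries, and $n_e > n_t$. Then

$$
\lambda_{\max}(\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{H}_e^\dagger\mathbf{H}_e) \sim \frac{2n_t}{2n_e - 2n_t + 1}\,F_{2n_t,\,2n_e - 2n_t + 1}, \tag{115}
$$

where $F_{2n_t,\,2n_e - 2n_t + 1}$ is the F-distribution with $2n_t$ and
$2n_e - 2n_t + 1$ degrees of freedom, i.e.,

$$
F_{2n_t,\,2n_e - 2n_t + 1} \overset{d}{=} \frac{v_1/(2n_t)}{v_2/(2n_e - 2n_t + 1)}, \tag{116}
$$

where $\overset{d}{=}$ denotes equality in distribution, and where $v_1$
and $v_2$ are independent chi-squared random variables with $2n_t$ and
$2n_e - 2n_t + 1$ degrees of freedom, respectively.

Using Fact 6 it follows that with $\beta = n_e/n_t$ fixed,

$$
\lim_{n_t \to \infty} \lambda_{\max}(\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{H}_e^\dagger\mathbf{H}_e) \overset{\mathrm{a.s.}}{=} \frac{1}{\beta - 1}, \quad\text{when } \beta > 1. \tag{117}
$$

Indeed, from the strong law of large numbers we have that the
random variables $v_1$ and $v_2$ in (116) satisfy, for $\beta > 1$,

$$
\lim_{n_t \to \infty} \frac{v_1}{2n_t} \overset{\mathrm{a.s.}}{=} 1, \quad\text{and}\quad
\lim_{n_t \to \infty} \frac{v_2}{2n_t(\beta - 1) + 1} \overset{\mathrm{a.s.}}{=} 1. \tag{118}
$$

Combining (118) with (116) yields (117).

B. Proof of Theorem 4

First, from Theorem 2 we have that

$$
\begin{aligned}
C(P, n_t, n_e) &= \{\log\lambda_{\max}(\mathbf{I} + P\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{I} + P\mathbf{H}_e^\dagger\mathbf{H}_e)\}^+\\
&\ge \{\log\lambda_{\max}(P\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{I} + P\mathbf{H}_e^\dagger\mathbf{H}_e)\}^+\\
&= \{\log(P\mathbf{h}_r^\dagger(\mathbf{I} + P\mathbf{H}_e^\dagger\mathbf{H}_e)^{-1}\mathbf{h}_r)\}^+, && (119)
\end{aligned}
$$

where (119) follows from the quadratic form representation
(3) of the generalized eigenvalue.

Rewriting (119) using the notation

$$
\tilde{\mathbf{h}}_r = \frac{1}{\sqrt{n_t}}\mathbf{h}_r \quad\text{and}\quad \tilde{\mathbf{H}}_e = \frac{1}{\sqrt{n_t}}\mathbf{H}_e, \tag{120}
$$

we then obtain (20) as desired:

$$
\begin{aligned}
\tilde{C}(\gamma, \beta) &= C(\gamma/n_t, n_t, \beta n_t)\\
&\ge \{\log(\gamma\,\tilde{\mathbf{h}}_r^\dagger(\mathbf{I} + \gamma\tilde{\mathbf{H}}_e^\dagger\tilde{\mathbf{H}}_e)^{-1}\tilde{\mathbf{h}}_r)\}^+\\
&\overset{\mathrm{a.s.}}{\longrightarrow} \{\log\xi(\gamma, \beta)\}^+ \quad\text{as } n_t \to \infty, && (121)
\end{aligned}
$$

where to obtain (121) we have applied (113) and (114).

The derivation of the scaling law (22) for the masked
beamforming scheme is analogous. Indeed, from Proposition 1
we have

$$
\begin{aligned}
\tilde{R}_{\mathrm{MB}}(\gamma, n_t, \beta n_t) &\ge \{\log\lambda_{\max}(\gamma\tilde{\mathbf{h}}_r\tilde{\mathbf{h}}_r^\dagger,\ \mathbf{I} + \gamma\tilde{\mathbf{H}}_e^\dagger\tilde{\mathbf{H}}_e)\}^+\\
&= \{\log(\gamma\,\tilde{\mathbf{h}}_r^\dagger(\mathbf{I} + \gamma\tilde{\mathbf{H}}_e^\dagger\tilde{\mathbf{H}}_e)^{-1}\tilde{\mathbf{h}}_r)\}^+\\
&\overset{\mathrm{a.s.}}{\longrightarrow} \{\log\xi(\gamma, \beta)\}^+ \quad\text{as } n_t \to \infty,
\end{aligned}
$$

where as above the last line comes from applying (113) and
(114).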
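C. Proof of Theorem 5

When $\beta < 1$ (i.e., $n_e < n_t$), we have
$\mathbf{H}_e^\perp\mathbf{h}_r \ne \mathbf{0}$ almost surely, so (13b) holds, i.e.,

$$
\lim_{P \to \infty} C(P) = \infty \tag{122}
$$

as (23) reflects.

When $\beta > 1$ (i.e., $n_e > n_t$), $\mathbf{H}_e^\dagger\mathbf{H}_e$ is nonsingular
almost surely, so (13a) holds, i.e.,

$$
\lim_{P \to \infty} C(P) = \{\log\lambda_{\max}(\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{H}_e^\dagger\mathbf{H}_e)\}^+.
$$

Taking the limit $n_e, n_t \to \infty$ with $n_e/n_t = \beta$ fixed, and using
(117), we obtain

$$
\lim_{n_t \to \infty}\ \lim_{P \to \infty} C(P) = \{-\log(\beta - 1)\}^+
$$

as (23) asserts.

Furthermore, via (16) we have that

$$
\lim_{P \to \infty} R_{\mathrm{MB}}(P) = \{\log\lambda_{\max}(\mathbf{h}_r\mathbf{h}_r^\dagger,\ \mathbf{H}_e^\dagger\mathbf{H}_e)\}^+ = \lim_{P \to \infty} C(P),
$$

whence (24).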
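
The limiting expression and the almost-sure limit (117) behind it can be illustrated with a short simulation. The sketch below assumes NumPy; the function names are hypothetical, rates are in b/s/Hz, and the value at exactly $\beta = 1$ is set to infinity as an assumption, since only $\beta < 1$ and $\beta > 1$ are treated above. It uses the fact that, for a rank-one numerator, $\lambda_{\max}(\mathbf{h}_r\mathbf{h}_r^\dagger, \mathbf{H}_e^\dagger\mathbf{H}_e) = \mathbf{h}_r^\dagger(\mathbf{H}_e^\dagger\mathbf{H}_e)^{-1}\mathbf{h}_r$.

```python
import numpy as np

def high_snr_large_system_capacity(beta):
    """{-log2(beta - 1)}^+ for beta > 1; infinity for beta <= 1 (beta = 1 is an assumption)."""
    if beta <= 1:
        return float("inf")
    return max(0.0, -float(np.log2(beta - 1)))

def sample_high_snr_eigenvalue(n_t, beta, seed=0):
    """One draw of h^H (H^H H)^{-1} h with i.i.d. CN(0, 1) entries and n_e = beta * n_t."""
    rng = np.random.default_rng(seed)
    n_e = int(round(beta * n_t))
    h = (rng.standard_normal(n_t) + 1j * rng.standard_normal(n_t)) / np.sqrt(2)
    H = (rng.standard_normal((n_e, n_t)) + 1j * rng.standard_normal((n_e, n_t))) / np.sqrt(2)
    gram = H.conj().T @ H
    return float((h.conj() @ np.linalg.solve(gram, h)).real)
```

For example, `high_snr_large_system_capacity(1.5)` evaluates $-\log_2(0.5)$, i.e., 1 b/s/Hz, and a single draw of `sample_high_snr_eigenvalue` with a few hundred transmit antennas and $\beta = 2$ should land near $1/(\beta - 1) = 1$.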

IX. Fading Channel Analysis 

We prove the lower and upper bounds of Theorem 6 
separately. 

A. Proof of (27a) 

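By viewing the fading channel as a set of parallel channels
indexed by the channel gain $\mathbf{h}_r$ of the intended receiver$^{8}$ and
the eavesdropper's observation as $(\mathbf{y}_e, \mathbf{H}_e)$, the rate

$$
R = I(u; y_r \mid \mathbf{h}_r) - I(u; \mathbf{y}_e, \mathbf{H}_e \mid \mathbf{h}_r) \tag{123}
$$

is achievable for any choice of $p_{u|\mathbf{h}_r}$ and $p_{\mathbf{x}|u,\mathbf{h}_r}$ that
satisfies the power constraint $E[\rho(\mathbf{h}_r)] \le P$. We choose
distributions corresponding to an adaptive version of masked
beamforming, i.e., [cf. (103)]

$$
p_{u|\mathbf{h}_r} = \mathcal{CN}(0, \tilde{\rho}(\mathbf{h}_r)), \qquad
p_{\mathbf{x}|u,\mathbf{h}_r} = \mathcal{CN}\big(u\,\tilde{\mathbf{h}}_r,\ \tilde{\rho}(\mathbf{h}_r)(\mathbf{I} - \tilde{\mathbf{h}}_r\tilde{\mathbf{h}}_r^\dagger)\big), \tag{124}
$$

where we have chosen the convenient normalizations [cf. (104)
and (105)]

$$
\tilde{\rho}(\mathbf{h}_r) = \frac{\rho(\mathbf{h}_r)}{n_t} \tag{125}
$$

and

$$
\tilde{\mathbf{h}}_r = \frac{\mathbf{h}_r}{\|\mathbf{h}_r\|}. \tag{126}
$$

Evaluating (123) with the distributions (124) yields (27a)
with (29a):

$$
\begin{aligned}
&I(u; y_r \mid \mathbf{h}_r) - I(u; \mathbf{y}_e, \mathbf{H}_e \mid \mathbf{h}_r) && (127)\\
&\quad= E\big[\log(1 + \tilde{\rho}(\mathbf{h}_r)\|\mathbf{h}_r\|^2)\big]
+ E\big[\log\big(\tilde{\mathbf{h}}_r^\dagger(\mathbf{I} + \tilde{\rho}(\mathbf{h}_r)\mathbf{H}_e^\dagger\mathbf{H}_e)^{-1}\tilde{\mathbf{h}}_r\big)\big] && (128)\\
&\quad= E\left[\log\left(1 + \frac{1}{\tilde{\rho}(\mathbf{h}_r)\|\mathbf{h}_r\|^2}\right)\right]
+ E\big[\log\big(\tilde{\rho}(\mathbf{h}_r)\mathbf{h}_r^\dagger(\mathbf{I} + \tilde{\rho}(\mathbf{h}_r)\mathbf{H}_e^\dagger\mathbf{H}_e)^{-1}\mathbf{h}_r\big)\big], && (129)
\end{aligned}
$$

where the steps leading to (128) are analogous to those used
in Section VII-A for the nonfading case and hence have been
omitted.

$^{8}$ Since the fading coefficients are continuous valued, one has to discretize
these coefficients before mapping to parallel channels. By choosing
appropriately fine quantization levels one can approach the rate as closely as
possible. See e.g., [9] for a discussion.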
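
To make the two expressions in Theorem 6 concrete, the following sketch (assuming NumPy, with hypothetical function names) estimates $E[R_{\mathrm{FF},-}]$ and $E[R_{\mathrm{FF},+}]$ by Monte Carlo for the constant allocation $\rho(\mathbf{h}_r) = P$. This allocation is feasible, so the first estimate approximates a valid lower bound on the fast fading secrecy capacity; the second is only the upper-bound expression evaluated at one allocation, not its maximum over all allocations, and so is not by itself an upper bound on capacity.

```python
import numpy as np

def fading_bounds_constant_power(n_t, n_e, P, trials=200, seed=0):
    """Monte Carlo estimates (bits) of E[R_FF,-] and E[R_FF,+] with rho(h_r) = P."""
    rng = np.random.default_rng(seed)
    lower, upper = [], []
    for _ in range(trials):
        h = (rng.standard_normal(n_t) + 1j * rng.standard_normal(n_t)) / np.sqrt(2)
        H = (rng.standard_normal((n_e, n_t)) + 1j * rng.standard_normal((n_e, n_t))) / np.sqrt(2)
        gram = H.conj().T @ H
        # Lower-bound expression (29a): adaptive masked beamforming with rho/n_t per antenna.
        Pt = P / n_t
        q = (h.conj() @ np.linalg.solve(np.eye(n_t) + Pt * gram, h)).real
        r_minus = np.log2(1 + 1 / (Pt * np.linalg.norm(h) ** 2)) + np.log2(Pt * q)
        # Upper-bound expression (29b): {log2 lambda_max(I + rho h h^H, I + rho H^H H)}^+.
        A = np.eye(n_t) + P * np.outer(h, h.conj())
        B = np.eye(n_t) + P * gram
        lam = np.linalg.eigvals(np.linalg.solve(B, A)).real.max()
        r_plus = max(0.0, float(np.log2(lam)))
        lower.append(float(r_minus))
        upper.append(r_plus)
    return float(np.mean(lower)), float(np.mean(upper))
```

For every realization the first expression does not exceed the second, so the first returned average is never larger than the second.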

B. Proof of (27b) 

Suppose that there is a sequence of $(2^{nR}, n)$ codes such
that for a sequence $\varepsilon_n$ (with $\varepsilon_n \to 0$ as $n \to \infty$),

$$
\frac{1}{n}H(w) - \frac{1}{n}H(w \mid \mathbf{y}_e^n, \mathbf{H}_e^n, \mathbf{h}_r^n) \le \varepsilon_n, \qquad
\Pr(\hat{w} \ne w) \le \varepsilon_n. \tag{130}
$$

1) An auxiliary channel: We now introduce another channel
for which the noise variables $z_r(t)$ and $\mathbf{z}_e(t)$ are correlated,
but the conditions in (130) still hold. Hence any rate achievable
on the original channel is also achievable on this new channel.
In what follows, we will upper bound the rate achievable for
this new channel instead of the original channel.

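We begin by introducing some notation. Let

$$
\rho_t(h_r^t) \triangleq E\big[\|\mathbf{x}(t)\|^2 \,\big|\, \mathbf{h}_r^t = h_r^t\big] \tag{131}
$$

denote the transmitted power at time $t$, when the channel
realization of the intended receiver from time 1 to $t$ is $h_r^t$. Note
that $\rho_t(\cdot)$ satisfies the long term average power constraint, i.e.,

$$
E_{\mathbf{h}_r^n}\left[\frac{1}{n}\sum_{t=1}^{n}\rho_t(\mathbf{h}_r^t)\right] \le P. \tag{132}
$$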

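Next, let $p_{\mathbf{h}_r}$ and $p_{\mathbf{H}_e}$ denote the density functions of
$\mathbf{h}_r$ and $\mathbf{H}_e$, respectively, and let $p_{z_r}$ and $p_{\mathbf{z}_e}$
denote the density functions of the noise random variables in our
channel model (26). Observe that the constraints in (130) (and hence
the capacity) depend only on the distributions
$p_{\mathbf{z}_e^n, \mathbf{h}_r^n, \mathbf{H}_e^n}(\mathbf{z}_e^n, \mathbf{h}_r^n, \mathbf{H}_e^n)$ and
$p_{z_r^n, \mathbf{h}_r^n}(z_r^n, \mathbf{h}_r^n)$. Furthermore since the channel model (26) is
memoryless and $(\mathbf{h}_r, \mathbf{H}_e)$ are i.i.d. and mutually independent,
we have

$$
p_{\mathbf{z}_e^n, \mathbf{h}_r^n, \mathbf{H}_e^n}(\mathbf{z}_e^n, \mathbf{h}_r^n, \mathbf{H}_e^n)
= \prod_{t=1}^{n} p_{\mathbf{z}_e}(\mathbf{z}_e(t))\,p_{\mathbf{h}_r}(\mathbf{h}_r(t))\,p_{\mathbf{H}_e}(\mathbf{H}_e(t)), \tag{133}
$$

$$
p_{z_r^n, \mathbf{h}_r^n}(z_r^n, \mathbf{h}_r^n)
= \prod_{t=1}^{n} p_{z_r}(z_r(t))\,p_{\mathbf{h}_r}(\mathbf{h}_r(t)). \tag{134}
$$

Let $\mathcal{P}$ denote the set of conditional-joint distributions
$p_{z_r(t), \mathbf{z}_e(t) \mid \mathbf{h}_r^n, \mathbf{H}_e^n}$ with fixed conditional-marginals, i.e.,

$$
\begin{aligned}
\mathcal{P} = \big\{\, p_{z_r(t), \mathbf{z}_e(t) \mid \mathbf{h}_r^n, \mathbf{H}_e^n}(z_r, \mathbf{z}_e \mid \mathbf{h}_r^n, \mathbf{H}_e^n) \;\big|\;
& p_{z_r(t) \mid \mathbf{h}_r^n, \mathbf{H}_e^n}(z_r \mid \mathbf{h}_r^n, \mathbf{H}_e^n) = p_{z_r}(z_r),\\
& p_{\mathbf{z}_e(t) \mid \mathbf{h}_r^n, \mathbf{H}_e^n}(\mathbf{z}_e \mid \mathbf{h}_r^n, \mathbf{H}_e^n) = p_{\mathbf{z}_e}(\mathbf{z}_e) \,\big\}.
\end{aligned} \tag{135}
$$

Suppose that for each $t = 1, 2, \ldots, n$ we select a distribution
$p_{z_r(t), \mathbf{z}_e(t) \mid \mathbf{h}_r^n, \mathbf{H}_e^n} \in \mathcal{P}$ and consider a channel with
distribution

$$
\prod_{t=1}^{n} p_{z_r(t), \mathbf{z}_e(t) \mid \mathbf{h}_r^n, \mathbf{H}_e^n}(z_r(t), \mathbf{z}_e(t) \mid \mathbf{h}_r^n, \mathbf{H}_e^n)\,
p_{\mathbf{h}_r}(\mathbf{h}_r(t))\,p_{\mathbf{H}_e}(\mathbf{H}_e(t)). \tag{136}
$$

This new channel distribution has noise variables $(z_r(t), \mathbf{z}_e(t))$
correlated, where the correlation is possibly time-dependent,
but from (135) and (136), note that $z_r^n$ and $\mathbf{z}_e^n$ are marginally
Gaussian and i.i.d., and satisfy (133) and (134). Hence the
conditions in (130) are satisfied for this channel and the rate
$R$ is achievable.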

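In the sequel we select
$p_{z_r(t), \mathbf{z}_e(t) \mid \mathbf{h}_r^n, \mathbf{H}_e^n}(z_r, \mathbf{z}_e \mid \mathbf{h}_r^n, \mathbf{H}_e^n)$
to be the worst case noise distribution for the Gaussian
channel with gains $\mathbf{h}_r(t)$ and $\mathbf{H}_e(t)$, and power of $\rho_t(\mathbf{h}_r^t)$ in
Theorem 2, i.e., if $\psi_t$ is the eigenvector corresponding to the
largest generalized eigenvalue
$\lambda_{\max}(\mathbf{I} + \rho_t(\mathbf{h}_r^t)\mathbf{h}_r(t)\mathbf{h}_r^\dagger(t),\ \mathbf{I} + \rho_t(\mathbf{h}_r^t)\mathbf{H}_e^\dagger(t)\mathbf{H}_e(t))$,

$$
p_{z_r(t), \mathbf{z}_e(t) \mid \mathbf{h}_r^n, \mathbf{H}_e^n} = \mathcal{CN}\left(\mathbf{0}, \begin{bmatrix} 1 & \phi_t^\dagger \\ \phi_t & \mathbf{I} \end{bmatrix}\right),
\quad\text{where}\quad
\phi_t = \begin{cases}
\dfrac{\mathbf{H}_e(t)\psi_t}{\mathbf{h}_r^\dagger(t)\psi_t}, & \lambda_{\max} > 1,\\[2ex]
\mathbf{G}_e(t)\big(\mathbf{G}_e^\dagger(t)\mathbf{G}_e(t)\big)^{-1}\mathbf{g}_r(t), & \lambda_{\max} \le 1,
\end{cases} \tag{137}
$$

and where $\mathbf{G}_e(t)$ and $\mathbf{g}_r(t)$ are related to $\mathbf{H}_e(t)$ and $\mathbf{h}_r(t)$
as in (54). Our choice of $p_{z_r(t), \mathbf{z}_e(t) \mid \mathbf{h}_r^n, \mathbf{H}_e^n}$ is such that
$(z_r(t), \mathbf{z}_e(t))$ only depend on $(\mathbf{H}_e(t), \mathbf{h}_r(t), \rho_t(\mathbf{h}_r^t))$, i.e.,

$$
(\mathbf{H}_e^n, \mathbf{h}_r^n) \to (\rho_t(\mathbf{h}_r^t), \mathbf{h}_r(t), \mathbf{H}_e(t)) \to (z_r(t), \mathbf{z}_e(t)) \tag{138}
$$

forms a Markov chain.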

2) Upper bound on the auxiliary channel: We now upper 
bound the secrecy rate for the channel (136). Note that this 
also upper bounds the rate on the original channel. 

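From Fano's inequality, there exists a sequence $\varepsilon_n'$ such
that $\varepsilon_n' \to 0$ as $n \to \infty$, and

$$
\frac{1}{n}H(w \mid y_r^n, \mathbf{h}_r^n) \le \varepsilon_n'.
$$

Hence

$$
\begin{aligned}
nR = H(w) &\le I(w; y_r^n \mid \mathbf{h}_r^n) + n\varepsilon_n'\\
&\le I(w; y_r^n \mid \mathbf{h}_r^n) - I(w; \mathbf{y}_e^n, \mathbf{H}_e^n \mid \mathbf{h}_r^n) + n(\varepsilon_n + \varepsilon_n') && (139)\\
&\le I(w; y_r^n \mid \mathbf{h}_r^n, \mathbf{H}_e^n, \mathbf{y}_e^n) + n(\varepsilon_n + \varepsilon_n')\\
&\le I(\mathbf{x}^n; y_r^n \mid \mathbf{h}_r^n, \mathbf{H}_e^n, \mathbf{y}_e^n) + n(\varepsilon_n + \varepsilon_n') && (140)\\
&\le \sum_{t=1}^{n} I(\mathbf{x}(t); y_r(t) \mid \mathbf{H}_e^n, \mathbf{h}_r^n, \mathbf{y}_e(t)) + n(\varepsilon_n + \varepsilon_n'), && (141)
\end{aligned}
$$

where (139) follows from the secrecy condition (cf. (130)),
(140) follows from the Markov relation
$w \leftrightarrow (\mathbf{x}^n, \mathbf{y}_e^n, \mathbf{h}_r^n, \mathbf{H}_e^n) \leftrightarrow y_r^n$, and (141) holds
because for the channel (136) we have

$$
h(y_r^n \mid \mathbf{y}_e^n, \mathbf{H}_e^n, \mathbf{h}_r^n, \mathbf{x}^n) = \sum_{t=1}^{n} h(y_r(t) \mid \mathbf{y}_e(t), \mathbf{h}_r^n, \mathbf{H}_e^n, \mathbf{x}(t)).
$$

We next upper bound the term
$I(\mathbf{x}(t); y_r(t) \mid \mathbf{y}_e(t), \mathbf{H}_e^n, \mathbf{h}_r^n)$
in (141) for each $t = 1, 2, \ldots, n$.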

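$$
\begin{aligned}
&I(\mathbf{x}(t); y_r(t) \mid \mathbf{y}_e(t), \mathbf{H}_e^n, \mathbf{h}_r^n)\\
&\quad\le I(\mathbf{x}(t); y_r(t) \mid \mathbf{y}_e(t), \mathbf{H}_e(t), \mathbf{h}_r(t), \rho_t(\mathbf{h}_r^t)) && (142)\\
&\quad\le E\big[\{\log\lambda_{\max}(\mathbf{I} + \rho_t(\mathbf{h}_r^t)\mathbf{h}_r(t)\mathbf{h}_r^\dagger(t),\ \mathbf{I} + \rho_t(\mathbf{h}_r^t)\mathbf{H}_e^\dagger(t)\mathbf{H}_e(t))\}^+\big], && (143)
\end{aligned}
$$

where (142) follows from the fact that (cf. (138))

$$
(\mathbf{H}_e^n, \mathbf{h}_r^n) \to (\mathbf{x}(t), \rho_t(\mathbf{h}_r^t), \mathbf{h}_r(t), \mathbf{H}_e(t)) \to (y_r(t), \mathbf{y}_e(t))
$$

forms a Markov chain and (143) follows since our choice of
the noise distribution in (137) is the worst case noise in (7)
for the Gaussian channel with gains $\mathbf{h}_r(t)$, $\mathbf{H}_e(t)$ and power
$\rho_t(\mathbf{h}_r^t)$, hence the derivation in Theorem 2 applies.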
Substituting (143) into (141) we have, 

nR - n(e n + e' n ) 

n 

= 5^£H.(t ) ,h}[{logA m «(I + p t (hJ)h r (t)ht(t), 

*=i 

I + Pt (h*)Ht(t)H c (i))} + ] (144) 

n 

< ^H e( t),h t( t) [{log A max (I + [ft(hj)]h r (t)ht(t), 

t=i 

I + E ht - 1 [p t (hl)}Ht(t)H e (t))} + ] (145) 

n 

= ]T ^H e( t),h r( t) [{log A max (I + ft(h r (t))h r (t)ht(t), 

t=l 

I + /) t (h r (t))Ht(t)H c (t))} + ] (146) 

n 

= J] ^H ,h r [{log A max (I + /5 t (h r )h r hJ, I + i 5 t (h r )HtH c )}+ 

t=i 

(147) 

n 1 

< n£H e ,h r [{logA max (I + ^-/3 t (h r )h r hJ, 

t=i n 

n 

I + E-^(MHlH c )} + ] (148) 

t=i 

- nE Hc . hr [{log A max (I + p(h r )h r hj, I + p(h r )HtH e )}+] 

(149) 

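where (145) and (148) follow from Jensen's inequality since
$C(P) = \{\log \lambda_{\max}(\mathbf{I} + P\,\mathbf{h}_r\mathbf{h}_r^\dagger,\; \mathbf{I} + P\,\mathbf{H}_e^\dagger\mathbf{H}_e)\}^+$ is a capacity
and therefore concave in $P$, (146) follows by defining

\[
\hat{\rho}_t(\mathbf{h}_r) = E_{\mathbf{h}_r^{t-1}}[\rho_t(\mathbf{h}_r^t)], \tag{150}
\]

(147) follows from the fact that the distribution of both $\mathbf{h}_r$
and $\mathbf{H}_e$ does not depend on $t$, and (149) follows by defining

\[
\rho(\mathbf{h}_r) = \frac{1}{n}\sum_{t=1}^{n} \hat{\rho}_t(\mathbf{h}_r).
\]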
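To complete the proof, note that

\begin{align}
E_{\mathbf{h}_r}[\rho(\mathbf{h}_r)]
&= \frac{1}{n}\sum_{t=1}^{n} E_{\mathbf{h}_r}[\hat{\rho}_t(\mathbf{h}_r)] \tag{151} \\
&= \frac{1}{n}\sum_{t=1}^{n} E_{\mathbf{h}_r^t}[\rho_t(\mathbf{h}_r^t)] \le P, \tag{152}
\end{align}

where (151) follows from (150) and the fact that the channel
gains are i.i.d., and (152) follows from (132).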
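
The Jensen step from (147) to (148) can be checked numerically for a single channel realization. The following is a minimal sketch, not part of the original proof: the dimensions, the random channel draw, and the per-symbol powers `rho_hat` are all hypothetical, and the helper names `lam_max` and `rate` are introduced only for illustration. It compares the average of $C(\hat{\rho}_t)$ with $C$ evaluated at the average power.

```python
import numpy as np

def lam_max(A, B):
    """Largest generalized eigenvalue of (A, B), A Hermitian, B positive definite."""
    L = np.linalg.cholesky(B)            # B = L L^H
    Linv = np.linalg.inv(L)
    M = Linv @ A @ Linv.conj().T         # same eigenvalues as the pencil (A, B)
    M = (M + M.conj().T) / 2             # symmetrize away rounding error
    return float(np.linalg.eigvalsh(M)[-1])

def rate(P, h_r, H_e):
    """C(P) = {log lam_max(I + P h h^H, I + P H^H H)}^+ in nats."""
    n_t = h_r.shape[0]
    A = np.eye(n_t) + P * np.outer(h_r, h_r.conj())
    B = np.eye(n_t) + P * (H_e.conj().T @ H_e)
    return max(np.log(lam_max(A, B)), 0.0)

# Hypothetical setup: 4 transmit antennas, 3 eavesdropper antennas, 8 time slots.
rng = np.random.default_rng(0)
n_t, n_e, n = 4, 3, 8
h_r = (rng.standard_normal(n_t) + 1j * rng.standard_normal(n_t)) / np.sqrt(2)
H_e = (rng.standard_normal((n_e, n_t)) + 1j * rng.standard_normal((n_e, n_t))) / np.sqrt(2)
rho_hat = rng.uniform(0.0, 10.0, size=n)   # illustrative per-slot powers

# (1/n) sum_t C(rho_hat_t), the per-realization integrand of (147) divided by n
avg_of_rates = float(np.mean([rate(p, h_r, H_e) for p in rho_hat]))
# C((1/n) sum_t rho_hat_t), the per-realization integrand of (148)-(149)
rate_of_avg = rate(float(rho_hat.mean()), h_r, H_e)

print(avg_of_rates, rate_of_avg)
```

If $C(P)$ is concave, as the proof asserts, the second printed value should be no smaller than the first for any choice of powers.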

C. Proof of Proposition 2 

The proof is immediate from Theorems 4, 5 and 6. 
For the lower bound, we only consider the case when
$\log \xi(P, \beta) > 0$, since otherwise the rate is zero. We select
$\rho(\mathbf{h}_r) = P$ to be fixed for each $\mathbf{h}_r$. Then we have from
Theorem 4 that

\[
R_{FF,-}(\mathbf{h}_r, \mathbf{H}_e, P) \xrightarrow{\text{a.s.}} \log \xi(P, \beta).
\]

Finally, since almost-sure convergence implies convergence in
expectation,

\[
\lim_{n_t \to \infty} E[R_{FF,-}(\mathbf{h}_r, \mathbf{H}_e, P)] = \log \xi(P, \beta),
\]

which establishes the lower bound (30). For the upper bound,
since

\[
R_{FF,+}(\mathbf{h}_r, \mathbf{H}_e, P) = \{\log \lambda_{\max}(\mathbf{I} + P\,\mathbf{h}_r\mathbf{h}_r^\dagger,\; \mathbf{I} + P\,\mathbf{H}_e^\dagger\mathbf{H}_e)\}^+,
\]

we have from Theorem 5 that

\[
\lim_{n_t \to \infty} R_{FF,+}(\mathbf{h}_r, \mathbf{H}_e, P) \le C(\infty, \beta), \tag{153}
\]

and hence

\[
\lim_{n_t \to \infty} C_{FF}(P = \gamma, n_t, n_e = \beta n_t) \le \lim_{n_t \to \infty} E[R_{FF,+}(\mathbf{h}_r, \mathbf{H}_e, \gamma)] \le C(\infty, \beta),
\]

where we again use the fact that almost sure convergence
implies convergence in expectation.

X. Concluding Remarks 

The present work characterizes the key performance characteristics
and tradeoffs inherent in communication over the
MISOME channel. There are many opportunities for further
work. As one example, stronger results (i.e., tighter bounds)
for the fast fading case would be quite useful. Another
example would be extending the results to the general MIMOME
channel. For the latter, the high SNR regime has been
characterized [10] using generalized singular value analysis,
and the details will be reported elsewhere.

More generally, many recent architectures for wireless systems
exploit the knowledge of the channel at the physical
layer in order to increase the system throughput and reliability.
Many of these systems have a side benefit of providing
security. It is naturally of interest to quantify these gains and
identify potential applications.

XI. Acknowledgement 

Yonina C. Eldar and Ami Wiesel provided an elegant
justification that a rank-one covariance maximizes the upper
bound in Theorem 1, which appears between (46) and (47).

Appendix I 
Proof of Lemma 1 

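Suppose there exists a sequence of $(2^{nR}, n)$ codes such that
for every $\varepsilon > 0$ and $n$ sufficiently large we have that

\begin{align}
\Pr(w \ne \hat{w}) &\le \varepsilon, \tag{154} \\
\frac{1}{n} I(w; \mathbf{y}_e^n) &\le \varepsilon, \tag{155} \\
\frac{1}{n} \sum_{i=1}^{n} E[\|\mathbf{x}(i)\|^2] &\le P. \tag{156}
\end{align}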



We first note that (154) implies, from Fano's inequality,

\[
\frac{1}{n} I(w; y_r^n) \ge R - \varepsilon_F, \tag{157}
\]

where $\varepsilon_F \to 0$ as $\varepsilon \to 0$. Combining (155) and (157), we have
for $\varepsilon' = \varepsilon + \varepsilon_F$:

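\begin{align}
nR - n\varepsilon'
&\le I(w; y_r^n) - I(w; \mathbf{y}_e^n) \notag \\
&\le I(w; y_r^n, \mathbf{y}_e^n) - I(w; \mathbf{y}_e^n) \tag{158} \\
&= I(w; y_r^n \mid \mathbf{y}_e^n) \tag{159} \\
&= h(y_r^n \mid \mathbf{y}_e^n) - h(y_r^n \mid \mathbf{y}_e^n, w) \notag \\
&\le h(y_r^n \mid \mathbf{y}_e^n) - h(y_r^n \mid \mathbf{y}_e^n, w, \mathbf{x}^n) \tag{160} \\
&= h(y_r^n \mid \mathbf{y}_e^n) - h(y_r^n \mid \mathbf{y}_e^n, \mathbf{x}^n) \tag{161} \\
&= h(y_r^n \mid \mathbf{y}_e^n) - \sum_{t=1}^{n} h(y_r(t) \mid \mathbf{y}_e(t), \mathbf{x}(t)) \tag{162} \\
&\le \sum_{t=1}^{n} h(y_r(t) \mid \mathbf{y}_e(t)) - \sum_{t=1}^{n} h(y_r(t) \mid \mathbf{y}_e(t), \mathbf{x}(t)) \notag \\
&= n I(\mathbf{x}; y_r \mid \mathbf{y}_e, q) \tag{163} \\
&\le n I(\mathbf{x}; y_r \mid \mathbf{y}_e), \tag{164}
\end{align}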

where (158) and (159) each follow from the chain rule of mutual
information, (160) follows from the fact that conditioning
cannot increase differential entropy, (161) follows from the
Markov relation $w \leftrightarrow (\mathbf{x}^n, \mathbf{y}_e^n) \leftrightarrow y_r^n$, and (162) follows
from the fact that the channel is memoryless. Moreover, (163) is
obtained by defining a time-sharing random variable $q$ that
takes values uniformly over the index set $\{1, 2, \ldots, n\}$ and
defining $(\mathbf{x}, y_r, \mathbf{y}_e)$ to be the tuple of random variables that,
conditioned on $q = t$, have the same joint distribution as
$(\mathbf{x}(t), y_r(t), \mathbf{y}_e(t))$. It then follows that for our choice of $\mathbf{x}$ and
given (156), $E[\|\mathbf{x}\|^2] \le P$. Finally, (164) follows from the fact
that $I(\mathbf{x}; y_r \mid \mathbf{y}_e)$ is concave in $p_{\mathbf{x}}$ (see, e.g., [9, Appendix I] for
a proof), so that Jensen's inequality can be applied.
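
As a sketch of how the time-sharing variable produces (163) and the power constraint on $\mathbf{x}$, the single-letter terms can be expanded as follows. The only ingredient beyond the definitions above is that $q$ is uniform on $\{1, \ldots, n\}$, so $\Pr(q = t) = 1/n$.

```latex
\begin{align*}
\sum_{t=1}^{n} \big[ h(y_r(t) \mid \mathbf{y}_e(t)) - h(y_r(t) \mid \mathbf{y}_e(t), \mathbf{x}(t)) \big]
  &= \sum_{t=1}^{n} I(\mathbf{x}(t); y_r(t) \mid \mathbf{y}_e(t)) \\
  &= n \sum_{t=1}^{n} \Pr(q = t)\, I(\mathbf{x}; y_r \mid \mathbf{y}_e, q = t) \\
  &= n\, I(\mathbf{x}; y_r \mid \mathbf{y}_e, q), \\[4pt]
E[\|\mathbf{x}\|^2]
  = \sum_{t=1}^{n} \Pr(q = t)\, E[\|\mathbf{x}(t)\|^2]
  &= \frac{1}{n} \sum_{t=1}^{n} E[\|\mathbf{x}(t)\|^2] \le P .
\end{align*}
```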

Appendix II 
Derivation of (49) 

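The argument of the logarithm on the left-hand side of (49)
is convex in $\boldsymbol{\theta}$, so it is straightforward to verify that the
minimizing $\boldsymbol{\theta}$ is

\[
\boldsymbol{\theta} = (\mathbf{I} + P\,\mathbf{H}_e\mathbf{H}_e^\dagger)^{-1}(P\,\mathbf{H}_e\mathbf{h}_r + \boldsymbol{\phi}). \tag{165}
\]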

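In the sequel, we exploit that by the definition of generalized
eigenvalues via (1),

\[
(\mathbf{I} + P\,\mathbf{h}_r\mathbf{h}_r^\dagger)\,\boldsymbol{\psi}_{\max} = \lambda_{\max}(\mathbf{I} + P\,\mathbf{H}_e^\dagger\mathbf{H}_e)\,\boldsymbol{\psi}_{\max}, \tag{166}
\]

or, rearranging,

\[
(\mathbf{h}_r\mathbf{h}_r^\dagger - \lambda_{\max}\mathbf{H}_e^\dagger\mathbf{H}_e)\,\boldsymbol{\psi}_{\max} = \frac{\lambda_{\max} - 1}{P}\,\boldsymbol{\psi}_{\max}. \tag{167}
\]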
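
For completeness, the rearrangement of the generalized eigenvalue relation can be spelled out in two lines. This is only an expansion of the step just stated, using the same symbols and assuming $P > 0$.

```latex
\begin{align*}
\boldsymbol{\psi}_{\max} + P\,\mathbf{h}_r\mathbf{h}_r^\dagger \boldsymbol{\psi}_{\max}
  &= \lambda_{\max}\boldsymbol{\psi}_{\max} + \lambda_{\max} P\,\mathbf{H}_e^\dagger\mathbf{H}_e \boldsymbol{\psi}_{\max} \\
P\,(\mathbf{h}_r\mathbf{h}_r^\dagger - \lambda_{\max}\mathbf{H}_e^\dagger\mathbf{H}_e)\,\boldsymbol{\psi}_{\max}
  &= (\lambda_{\max} - 1)\,\boldsymbol{\psi}_{\max}
\end{align*}
```

Dividing both sides of the second line by $P$ gives the rearranged form.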
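First we obtain a more convenient expression for $\boldsymbol{\theta}$ as
follows:

\begin{align}
\boldsymbol{\theta}
&= (\mathbf{I} + P\,\mathbf{H}_e\mathbf{H}_e^\dagger)^{-1}\Big[P\,\mathbf{H}_e\mathbf{h}_r + \frac{1}{\mathbf{h}_r^\dagger\boldsymbol{\psi}_{\max}}\mathbf{H}_e\boldsymbol{\psi}_{\max}\Big] \tag{168} \\
&= (\mathbf{I} + P\,\mathbf{H}_e\mathbf{H}_e^\dagger)^{-1}\,\frac{\mathbf{H}_e(P\,\mathbf{h}_r\mathbf{h}_r^\dagger + \mathbf{I})\,\boldsymbol{\psi}_{\max}}{\mathbf{h}_r^\dagger\boldsymbol{\psi}_{\max}} \notag \\
&= (\mathbf{I} + P\,\mathbf{H}_e\mathbf{H}_e^\dagger)^{-1}\,\lambda_{\max}\,\frac{\mathbf{H}_e(P\,\mathbf{H}_e^\dagger\mathbf{H}_e + \mathbf{I})\,\boldsymbol{\psi}_{\max}}{\mathbf{h}_r^\dagger\boldsymbol{\psi}_{\max}} \tag{169} \\
&= (\mathbf{I} + P\,\mathbf{H}_e\mathbf{H}_e^\dagger)^{-1}\,\lambda_{\max}\,\frac{(P\,\mathbf{H}_e\mathbf{H}_e^\dagger + \mathbf{I})\,\mathbf{H}_e\boldsymbol{\psi}_{\max}}{\mathbf{h}_r^\dagger\boldsymbol{\psi}_{\max}} \tag{170} \\
&= \lambda_{\max}\,\boldsymbol{\phi}, \tag{171}
\end{align}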



where (168) follows from substituting (48) into (165), and 
(169) follows from substituting via (166). 
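Next we have that

\begin{align}
\mathbf{h}_r - \mathbf{H}_e^\dagger\boldsymbol{\theta}
&= \mathbf{h}_r - \lambda_{\max}\,\frac{\mathbf{H}_e^\dagger\mathbf{H}_e\boldsymbol{\psi}_{\max}}{\mathbf{h}_r^\dagger\boldsymbol{\psi}_{\max}} \tag{172} \\
&= \frac{(\mathbf{h}_r\mathbf{h}_r^\dagger - \lambda_{\max}\mathbf{H}_e^\dagger\mathbf{H}_e)\,\boldsymbol{\psi}_{\max}}{\mathbf{h}_r^\dagger\boldsymbol{\psi}_{\max}} \tag{173} \\
&= \frac{(\lambda_{\max} - 1)\,\boldsymbol{\psi}_{\max}}{P\,\mathbf{h}_r^\dagger\boldsymbol{\psi}_{\max}}, \tag{174}
\end{align}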



where (172) follows from substituting from (171) with (48), 
and (173) follows by substituting (167). Thus, 



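\[
P\,\|\mathbf{h}_r - \mathbf{H}_e^\dagger\boldsymbol{\theta}\|^2 = (\lambda_{\max} - 1)\left[\frac{\lambda_{\max} - 1}{P\,|\mathbf{h}_r^\dagger\boldsymbol{\psi}_{\max}|^2}\right]. \tag{175}
\]

To simplify (175) further, we exploit that

\begin{align}
1 - \lambda_{\max}\|\boldsymbol{\phi}\|^2
&= 1 - \lambda_{\max}\,\frac{\boldsymbol{\psi}_{\max}^\dagger\mathbf{H}_e^\dagger\mathbf{H}_e\boldsymbol{\psi}_{\max}}{\boldsymbol{\psi}_{\max}^\dagger\mathbf{h}_r\mathbf{h}_r^\dagger\boldsymbol{\psi}_{\max}} \tag{176} \\
&= \frac{\boldsymbol{\psi}_{\max}^\dagger(\mathbf{h}_r\mathbf{h}_r^\dagger - \lambda_{\max}\mathbf{H}_e^\dagger\mathbf{H}_e)\,\boldsymbol{\psi}_{\max}}{|\mathbf{h}_r^\dagger\boldsymbol{\psi}_{\max}|^2} \notag \\
&= \frac{\lambda_{\max} - 1}{P\,|\mathbf{h}_r^\dagger\boldsymbol{\psi}_{\max}|^2}, \tag{177}
\end{align}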



where (176) follows by again substituting from (48), and (177) 
follows by again substituting from (167). In turn, replacing the 
term in brackets in (175) according to (177) then yields 

\[
P\,\|\mathbf{h}_r - \mathbf{H}_e^\dagger\boldsymbol{\theta}\|^2 = (\lambda_{\max} - 1)(1 - \lambda_{\max}\|\boldsymbol{\phi}\|^2). \tag{178}
\]

Finally, substituting (178) then (171) into the left hand side 
of (49) yields, following some minor algebra, the right hand 
side as desired. 
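
The identities used in this appendix can be sanity-checked numerically. The sketch below is illustrative only: it draws a hypothetical random channel, assumes that (48) reads $\boldsymbol{\phi} = \mathbf{H}_e\boldsymbol{\psi}_{\max} / (\mathbf{h}_r^\dagger\boldsymbol{\psi}_{\max})$ (as the substitution leading to (168) indicates), and evaluates the residuals of (166), (167), (171), and (178). All variable names and parameter values are assumptions introduced for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n_t, n_e, P = 4, 3, 2.0   # hypothetical dimensions and power

def crandn(*shape):
    """Circularly symmetric complex Gaussian samples (illustrative channel draw)."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

h_r = crandn(n_t)          # intended receiver channel, length n_t
H_e = crandn(n_e, n_t)     # eavesdropper channel, n_e x n_t

A = np.eye(n_t) + P * np.outer(h_r, h_r.conj())
B = np.eye(n_t) + P * (H_e.conj().T @ H_e)

# Largest generalized eigenpair of (A, B) via Cholesky whitening, B = L L^H.
L = np.linalg.cholesky(B)
Linv = np.linalg.inv(L)
M = Linv @ A @ Linv.conj().T
w, V = np.linalg.eigh((M + M.conj().T) / 2)
lam = w[-1]
psi = Linv.conj().T @ V[:, -1]          # satisfies A psi = lam B psi
psi /= np.linalg.norm(psi)

hpsi = np.vdot(h_r, psi)                # h_r^dagger psi_max
phi = H_e @ psi / hpsi                  # assumed form of (48)
theta = np.linalg.solve(np.eye(n_e) + P * (H_e @ H_e.conj().T),
                        P * (H_e @ h_r) + phi)   # minimizer (165)

res_166 = np.linalg.norm(A @ psi - lam * (B @ psi))
lhs_167 = (np.outer(h_r, h_r.conj()) - lam * (H_e.conj().T @ H_e)) @ psi
res_167 = np.linalg.norm(lhs_167 - (lam - 1) / P * psi)
res_171 = np.linalg.norm(theta - lam * phi)
lhs_178 = P * np.linalg.norm(h_r - H_e.conj().T @ theta) ** 2
rhs_178 = (lam - 1) * (1 - lam * np.linalg.norm(phi) ** 2)

print(res_166, res_167, res_171, lhs_178 - rhs_178)
```

All four printed quantities should be at the level of floating-point rounding if the reconstructed identities hold for this draw.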